Abstract

This paper proposes a generalization of the popular realized volatility framework by allowing its
measurement in the time-frequency domain and bringing robustness to both noise and jumps.
Based on a generalization of the Fan and Wang (2007) approach using smooth wavelets and the Maximum
Overlap Discrete Wavelet Transform, we present a new, general theory for wavelet decomposition of
integrated variance. Using wavelets, we not only gain a decomposition of the realized variance into
several investment horizons, but we are also able to estimate the jumps consistently. Basing our
estimator in the two-scale realized variance framework of Zhang et al. (2005), we are able to utilize
all available data and obtain an unbiased estimator in the presence of noise as well. The theory is also
tested in a large numerical study of the small sample performance of the estimators and compared
to other popular realized variation estimators under different simulation settings with changing
noise as well as jump level. The results reveal that our wavelet-based estimator is able to estimate
and forecast the realized measures with the greatest precision. Another notable contribution lies
in the application of the presented theory. Our time-frequency estimators not only produce more
efficient estimates, but also decompose the realized variation into arbitrarily chosen investment
horizons. The results thus provide a better understanding of the dynamics of stock markets.

1. Introduction

Volatility of asset returns has become one of the primary concerns in financial econometrics
research over the past decade. The increasingly popular Realized Volatility approach was pioneering
work which took advantage of the data in a nonparametric fashion, and as both theoretical insights
and data availability have grown rapidly in the past decade, the research in Realized Volatility has
brought great improvement in volatility estimation. The most fundamental result in realized
variation states that it provides a consistent nonparametric estimate of price variability over a
given time interval. The formalized theory is presented by Andersen et al. (2003). While these authors
provide a unified framework for modeling, Zhou (1996) was one of the first to provide a formal
assessment of the relationship between cumulative squared intraday returns and the underlying
return variance. The pioneering work by Olsen & Associates on the use of high-frequency data,
summarized by Dacorogna et al. (2001), produced milestone results for many of the more recent
empirical developments in realized variation. A vast quantity of literature on several aspects of
estimating volatility has emerged in the wake of these fundamental contributions.

Our work builds on the popular Realized Volatility approach, bringing even more insights to the 
theory. While most time series models are set in the time domain, we enrich the analysis with the
frequency domain. This is enabled by the use of the continuous wavelet transform. It is a logical
step to take, as the stock markets are believed to be driven by heterogeneous investment horizons. 
In our work, we ask if wavelet decomposition can improve our understanding of volatility series 
and hence improve volatility forecasting and risk management. 

The usage of wavelets is motivated by their very appealing feature that they can be embedded 
into stochastic processes, as shown by Antoniou and Gustafson (1999). Thus we can conveniently 
use them to extend the theory of quadratic variation. One of the issues with the interpretation 
of wavelets in economic applications is that they behave like a filter. Thus wavelets can hardly 
be used for forecasting in econometrics. But in the realized measures, we use wavelets to decompose
the daily variation of the returns using intraday information. By computing realized measures on 
the wavelet coefficients of the high frequency data, this problem vanishes for the daily volatility 
forecasts. Moreover, the approach suggests constructing a model from the wavelet decomposition. 

We are not the first to use this idea. Several attempts to use wavelets in the estimation of 
realized variation have emerged in the past few years. Høg and Lunde (2003) were the first to
suggest a wavelet estimator of realized variance. Capobianco (2004), for example, proposes to use a 
wavelet transform as a comparable estimator of quadratic variation. Subbotin (2008) uses wavelets 
to decompose volatility into a multi-horizon scale. Next, Nielsen and Frederiksen (2008) compare 
the finite sample properties of three integrated variance estimators, i.e., realized variance, Fourier 
and wavelet estimators. They consider several processes generating time series with a long memory,
jump processes as well as bid-ask bounce. Gençay et al. (2010) mention the possible use of wavelet
multiresolution analysis to decompose realized variance in their paper, while they concentrate on 
developing much more complicated structures of variance modeling in different regimes through 
wavelet-domain hidden Markov models. 

One remarkable exception which fully completes the current literature on using wavelets in 
realized variation measurement is the work of Fan and Wang (2007), who were the first to use the 
wavelet-based realized variance estimator and also the methodology for the estimation of jumps 
from the data. In our work, we generalize the results of Fan and Wang (2007) in several ways. 
Instead of using the Discrete Wavelet Transform we use the Maximum Overlap Discrete Wavelet 
Transform, which is a more efficient estimator and is not restricted to sample sizes that are powers of 
two. We also use smooth wavelets, specifically the Daubechies family of wavelets instead of the Haar 
type. Finally, we propose a general framework for the quadratic variation theory decomposition 
using wavelets in the continuous setting. 

An important theoretical contribution of this paper is wavelet decomposition of integrated vari- 
ance. Generalization of the martingale representation theorem to wavelet representation theorem 
gives us the power to decompose the return processes into several investment horizons in contin- 
uous time. We show that asymptotically, the wavelet decomposition is the same as the realized 
volatility estimator. Moreover, we use wavelets for jump detection. Connecting it with the result 
of Zhang et al. (2005), who introduced a two-scale realized volatility estimator robust to noise 
into the literature, we arrive at an estimator which is robust to both jumps and noise. Based on
this result, we present a complete wavelet-based realized variation theory generalizing the realized 
measures. To study the small sample behavior we run a large numerical study showing that the 
asymptotic property holds under various settings, and our wavelet-based estimator also proves to 
have the lowest forecast bias. 

After the theory is derived, we apply our estimator to the modeling of currency futures volatility.
By studying the statistical properties of unconditional daily log-return distributions standardized
by volatility estimated using the different estimators, we find that standardization by our
wavelet-based estimator brings the returns close to the Gaussian distribution. All the other
estimators are affected by the presence of jumps or noise in the data to some extent. The differences
are economically significant, as we find that the average volatility estimated using our
wavelet-based theory is 6.34% lower than the volatility estimated with the standard estimator.
Moreover, the wavelet-based estimator allows us to decompose the volatility series into a jump
component and several investment horizons. The decomposition of empirical volatility reveals
interesting results. Most of the volatility in the stock markets comes from high frequencies.

The paper is organized as follows. The second section introduces our theory for wavelet
decomposition of integrated variance, including the wavelet representation theorem. The third section
briefly introduces the standard realized variance measurement. Based on these results, the fourth
section derives the wavelet-based realized variance estimator and its properties. The fifth section
tests the theory in a numerical study and compares the small sample behavior of the wavelet-based
estimator with other popular estimators, while assuming different processes driving the
stock market with different amounts of noise and jumps. Specifically, we consider jump-diffusion
stochastic volatility and fractional stochastic volatility. The section concludes with a numerical
study assessing the forecasting performance of the estimators. The last section applies
the presented theory and decomposes the empirical volatility of forex markets.

2. Main theoretical result: Wavelet decomposition of integrated variance 

2.1. General setup and notation 

We begin with a short description of the framework used for studying processes in a
continuous-time no-arbitrage setting. Consider a univariate risky logarithmic asset price process
$p_t$ defined on a complete probability space $(\Omega, \mathcal{F}, P)$. The price process evolves
in continuous time over the interval $[0,T]$, where $T$ is a finite positive integer. Further,
consider the natural information filtration, an increasing family of $\sigma$-fields
$(\mathcal{F}_t)_{t\in[0,T]} \subseteq \mathcal{F}$, which satisfies the usual conditions.
Following Andersen et al. (2003), we define the continuously compounded asset return over the
$[t-h,t]$ time interval, $0 \le h \le t \le T$, by $r_{t,h} = p_t - p_{t-h}$. A special case of the
continuously compounded return, the cumulative return process from $t = 0$ up to time $t$,
$(r_t)_{t\in[0,T]}$, is then $r_t \equiv r_{t,t} = p_t - p_0$ and inherits all the main properties of
$p_t$. These definitions imply a simple relation between the period-by-period and the cumulative
returns that we use repeatedly in the further text: $r_{t,h} = r_t - r_{t-h}$, $0 \le h \le t \le T$.
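
The relation between period-by-period and cumulative returns can be checked numerically. The following is a minimal sketch on an integer time grid; the price values and the helper names `cumulative_returns` and `period_return` are illustrative assumptions and do not come from the paper.

```python
import math


def cumulative_returns(log_prices):
    """Cumulative return r_t = p_t - p_0 for every grid point t."""
    p0 = log_prices[0]
    return [p - p0 for p in log_prices]


def period_return(log_prices, t, h):
    """Continuously compounded return r_{t,h} = p_t - p_{t-h}."""
    return log_prices[t] - log_prices[t - h]


# Hypothetical prices observed at t = 0, ..., 4 (illustrative values only).
log_prices = [math.log(x) for x in (100.0, 101.0, 99.5, 102.0, 103.5)]
r = cumulative_returns(log_prices)

t, h = 4, 2
lhs = period_return(log_prices, t, h)  # r_{t,h}
rhs = r[t] - r[t - h]                  # r_t - r_{t-h}
print(abs(lhs - rhs) < 1e-12)
```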

A fundamental result of stochastic integration theory states that such processes permit a unique
canonical decomposition (e.g. Protter, 1992). Hence the instantaneous return $r_t$ can be uniquely
decomposed into a predictable and integrable mean (expected return) component and a local
martingale innovation. While the integral representations for continuous sample path semi-martingales
are rather abstract, the continuous-time models in the theoretical asset and derivatives pricing
literature are frequently assumed to have continuous sample paths with the corresponding diffusion
processes given in the form of stochastic differential equations (SDE henceforth). This assumption
can be made using the following result without loss of generality (Protter, 1992). For any
univariate, square-integrable, continuous sample path, logarithmic price process $(p_t)_{t\in[0,T]}$
which is not locally riskless, there exists a representation such that over $[t-h,t]$, for all
$0 \le h \le t \le T$,

$$ r_{t,h} = \int_{t-h}^{t} \mu_s \, ds + \int_{t-h}^{t} \sigma_s \, dW_s, \qquad (1) $$

where $\mu_s$ is an integrable, predictable and finite-variation stochastic process, $\sigma_s$ is a
strictly positive càdlàg stochastic process satisfying

$$ P\left[ \int_{t-h}^{t} \sigma_s^2 \, ds < \infty \right] = 1, $$

and $W_t$ is a standard Brownian motion.


2.2. Wavelet representation theorem 

Wavelets and certain stochastic processes have a common structure. In fact, wavelet theory 
may be embedded in stochastic processes, as shown by Antoniou and Gustafson (1999), who 
compare wavelets with martingales and stochastic processes. In our analysis, we use Daubechies 
compactly supported wavelets. The original construction was first published in Daubechies (1988), 
while a detailed discussion about the Daubechies type of wavelets can be found in Daubechies 
(1992). The advantage of using Daubechies family filters is that they improve the frequency-domain
characteristics of the Haar wavelet, while they can still be interpreted as generalized differences
of adjacent averages. For more details see Daubechies (1988), Daubechies (1992) and Gençay et al.
(2002). We also provide a brief introduction to Daubechies family wavelets in Appendix A.1.

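For our analysis, we need to define the continuous wavelet transform (Daubechies, 1988) first:

Definition 1. Continuous wavelet transform
If $\psi \in L^2(\mathbb{R})$ satisfies the admissibility condition

$$ C_\psi := \int_{-\infty}^{\infty} \frac{|\hat{\psi}(s)|^2}{|s|} \, ds < +\infty, \qquad (2) $$

where $\hat{\ }$ denotes the Fourier transform, then $\psi$ is called a basic wavelet. Relative to
every basic wavelet $\psi$, the continuous (integral) wavelet transform on $L^2(\mathbb{R})$ is
defined by

$$ (W_\psi f)(j,k) = \langle \psi_{j,k}, f \rangle = |j|^{-1/2} \int_{-\infty}^{\infty} \psi\left(\frac{s-k}{j}\right) f(s) \, ds, \qquad f \in L^2(\mathbb{R}), \qquad (3) $$

where $\langle \cdot,\cdot \rangle$ denotes the $L^2$ inner product,
$\psi_{j,k}(s) = |j|^{-1/2}\psi\left(\frac{s-k}{j}\right)$, and $j,k \in \mathbb{R}$ with $j \neq 0$.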

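Next we introduce the Calderón reconstruction formula (Chui, 1992).

Proposition 1. Calderón reconstruction formula
Let $\psi \in L^2(\mathbb{R})$ be a basic wavelet which defines a continuous wavelet transform
$(W_\psi f)(j,k)$. Then for any $f \in L^2(\mathbb{R})$ and $s \in \mathbb{R}$ at which $f$ is
continuous,

$$ f(s) = \frac{1}{C_\psi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (W_\psi f)(j,k) \, \psi_{j,k}(s) \, \frac{1}{j^2} \, dk \, dj. \qquad (4) $$

Furthermore, let $\psi$ satisfy the extra conditions

$$ \int_{0}^{+\infty} \frac{|\hat{\psi}(s)|^2}{s} \, ds = \int_{0}^{+\infty} \frac{|\hat{\psi}(-s)|^2}{s} \, ds = \frac{1}{2} C_\psi. \qquad (5) $$

Then

$$ f(s) = \frac{2}{C_\psi} \int_{0}^{\infty} \left[ \int_{-\infty}^{\infty} (W_\psi f)(j,k) \, \psi_{j,k}(s) \, dk \right] \frac{1}{j^2} \, dj \qquad (6) $$

for any $f \in L^2(\mathbb{R})$ and $s \in \mathbb{R}$ at which $f$ is continuous.

For the proof, see Chui (1992).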



The admissibility condition ensures that the Fourier transform of the wavelet, $\hat{\psi}(s)$, has
sufficient decay as $s \to 0$ (Daubechies, 1988). The finiteness of $C_\psi$ is guaranteed if
$\hat{\psi}(0) = 0$, which is equivalent to zero mean of the wavelet (Mallat, 1998),

$$ \int_{-\infty}^{\infty} \psi(s) \, ds = 0. \qquad (7) $$

Further, we impose the unit energy condition on the wavelet

$$ \int_{-\infty}^{\infty} \psi(s)^2 \, ds = 1. \qquad (8) $$

Conditions 7 and 8 ensure that the wavelet has some non-zero terms, but all excursions away
from zero must cancel out.



Theorem 1. Let $\psi$ be a Daubechies wavelet function D4. Then the extra conditions in
Proposition 1,

$$ \int_{0}^{+\infty} \frac{|\hat{\psi}(s)|^2}{s} \, ds = \int_{0}^{+\infty} \frac{|\hat{\psi}(-s)|^2}{s} \, ds = \frac{1}{2} C_\psi, \qquad (9) $$

are satisfied.

The proof is provided in Appendix A.3.

For more details about the continuous wavelet transform and the Calderón reconstruction
formula see Calderón (1964), Daubechies (1988) and Mallat (1998).

Based on the proposed theory, we are able to extend the martingale representation theorem
(Eq. 1) using the continuous wavelet transform. The following proposition will allow us to use
wavelet theory for the integrated variance decomposition.

Proposition 2. Wavelet representation theorem
For any univariate, square-integrable, continuous sample path, logarithmic price process
$(p_t)_{t\in[0,T]}$ which is not locally riskless, there exists a representation which can be
decomposed using wavelets such that for all $0 \le h \le t \le T$

$$
\begin{aligned}
r_{t,h} &= \mu_{t,h} + M_{t,h} = \int_{t-h}^{t} \mu_s \, ds + \int_{t-h}^{t} \sigma_s \, dW_s \qquad (10) \\
&= \frac{2}{C_\psi} \int_{t-h}^{t} \int_{0}^{\infty} \int_{\mathbb{R}} \psi_{j,k}(s) \langle \psi_{j,k}, \mu_s \rangle \, dk \, \frac{1}{j^2} \, dj \, ds
 + \frac{2}{C_\psi} \int_{t-h}^{t} \int_{0}^{\infty} \int_{\mathbb{R}} \psi_{j,k}(s) \langle \psi_{j,k}, \sigma_s \rangle \, dk \, \frac{1}{j^2} \, dj \, dW_s,
\end{aligned}
$$

where $\mu_s$ is an integrable, predictable and finite-variation stochastic process and $\sigma_s$
is a strictly positive càdlàg stochastic process satisfying
$P\left[\int_{t-h}^{t} \sigma_s^2 \, ds < \infty\right] = 1$. $\psi_{j,k} \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$
represents the Daubechies (D4) wavelet function with a compact support.

The proof is provided in Appendix A.4.

2.3. Decomposition of quadratic return variation

Generally, we assume that the latent logarithmic asset price follows a standard jump-diffusion
process and is contaminated with microstructure noise.

Proposition 3. Let $(y_t)_{t\in[0,T]}$ be the observed log prices, which are equal to the latent,
so-called "true log-price process", $dp_t = \mu_t \, dt + \sigma_t \, dW_t + \xi_t \, dq_t$,
$0 \le t \le T$, contaminated with microstructure noise $\epsilon_t$,

$$ y_t = p_t + \epsilon_t, \qquad (11) $$

where $\epsilon_t$ is zero mean i.i.d. noise with variance $\eta^2$, and $q$ is a Poisson process
uncorrelated with $W$ and governed by the constant jump intensity $\lambda$. The magnitude of the
jump in the return process is controlled by the factor $\xi_t \sim N(\bar{\xi}, \sigma_\xi^2)$.

The main object of interest in financial econometrics is the integrated variance of the
latent price process, $\langle p,p \rangle_t = \int_{t-h}^{t} \sigma_s^2 \, ds$. Quadratic return
variation over the $[t-h,t]$ time interval, $0 \le h \le t \le T$,

$$ QV_{t,h} = \underbrace{\int_{t-h}^{t} \sigma_s^2 \, ds}_{IV_{t,h}} + \underbrace{\sum_{t-h \le s \le t} J_s^2}_{\text{Jump Var.}} \qquad (12) $$

can be decomposed with a Daubechies D(4) wavelet as:

$$ QV_{t,h} = \underbrace{\frac{2}{C_\psi} \int_{t-h}^{t} \int_{0}^{\infty} \int_{\mathbb{R}} \psi_{j,k}(s) \langle \psi_{j,k}, \sigma_s^2 \rangle \, dk \, \frac{1}{j^2} \, dj \, ds}_{IV_{t,h}} + \underbrace{\sum_{t-h \le s \le t} J_s^2}_{\text{Jump Var.}}, \qquad (13) $$

where

$$ \langle \psi_{j,k}, \sigma_s^2 \rangle = |j|^{-1/2} \int_{\mathbb{R}} \psi\left(\frac{s-k}{j}\right) \sigma_s^2 \, ds, \qquad (14) $$

using the wavelet representation theorem from Proposition 2.

Based on Proposition 2, a model-free measure of the integrated variation part may be proposed
in analogy to the simple realized variance estimator. In order to be able to define the estimator,
we first need to define the tools for the discrete estimation of integrated variance and the
discrete wavelet transform, as quadratic variation is not directly observable.
3. Realized variance measurement

Before we continue with wavelet estimators, it is useful to briefly describe the common
quadratic variation measurement. The estimate of the integrated variance $\langle p,p \rangle_t$
will always be denoted as $\widehat{RV}^{(M)}_{t,h}$ in the further text, where $M$ will be replaced
by the abbreviation of the specific estimator used.

The recently popularized measure is called realized variance; it is a consistent and unbiased
estimator of the quadratic variation (Eq. 12) as the sampling frequency goes to infinity (Andersen and
Bollerslev, 1998; Andersen et al., 2001, 2003; Barndorff-Nielsen and Shephard, 2001, 2002a,b).
The realized variance over $[t-h,t]$, for $0 \le h \le t \le T$, is defined by

$$ \widehat{RV}^{(RV)}_{t,h} = \sum_{i=1}^{n} r^2_{t-h+\frac{i}{n}h,\,\frac{h}{n}}, \qquad (15) $$

where $n$ is the number of observations in $[t-h,t]$. The problem with this estimator is that we are
primarily interested in estimating $\langle p,p \rangle_t$ rather than the whole quadratic variation,
and that consistency cannot be maintained because the realized variance estimator becomes biased in
the presence of microstructure noise.
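
The noise bias of realized variance can be illustrated with a small simulation. The sketch below is not the paper's numerical study: it assumes a driftless Brownian log-price with constant volatility and Gaussian i.i.d. noise, and the parameter values and function names are hypothetical. Under these assumptions the realized variance of the noisy prices is inflated by roughly $2n\eta^2$.

```python
import math
import random


def realized_variance(log_prices):
    """Realized variance: the sum of squared intraday returns."""
    return sum((b - a) ** 2 for a, b in zip(log_prices, log_prices[1:]))


def simulate_log_prices(n, sigma, rng):
    """Driftless Brownian log-price on [0, 1] with constant volatility sigma."""
    path = [0.0]
    step_sd = sigma / math.sqrt(n)
    for _ in range(n):
        path.append(path[-1] + step_sd * rng.gauss(0.0, 1.0))
    return path


rng = random.Random(0)
n, sigma, eta = 20000, 0.2, 0.001                 # illustrative values only
p = simulate_log_prices(n, sigma, rng)            # latent price, IV = sigma**2
y = [x + eta * rng.gauss(0.0, 1.0) for x in p]    # observed price y = p + noise

rv_clean = realized_variance(p)
rv_noisy = realized_variance(y)
print(f"IV = {sigma**2:.4f}, RV(p) = {rv_clean:.4f}, RV(y) = {rv_noisy:.4f}")
print(f"approximate noise bias 2*n*eta^2 = {2 * n * eta**2:.4f}")
```

With these settings the bias term is of the same order as the integrated variance itself, which is why sampling at the highest available frequency is harmful for the plain estimator.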

Zhang et al. (2005) propose a solution to the problem of noise by introducing the Two-Scale
Realized Volatility (TSRV henceforth) estimator. Another estimator which is able to deal with
the noise, and which we use for comparison in our study, is the realized kernels (RK) estimator
introduced by Barndorff-Nielsen et al. (2008). Finally, Barndorff-Nielsen and Shephard (2004, 2006)
develop a very powerful and complete way of detecting the presence of jumps in high-frequency
data, bipower variation. The basic idea is to compare two measures of the integrated variance, one
containing the jump variation and the other being robust to jumps and hence containing only the
integrated variation part. In our work, we use the Andersen et al. (2011) adjustment of the original
Barndorff-Nielsen and Shephard (2004) estimator, which helps render it robust to certain types of
microstructure noise. As these estimators will be used for comparison in the small sample study,
we do not define them here so as not to distract the reader from the main text.

4. Estimation of the realized variance using wavelets 

The continuous wavelet transform is a very important concept which helps us with the derivation
of theoretical behavior on the time-scale space. Since we work with real data, we need some form
of sampling to compute the estimators, i.e., we have to use a suitable form of discretization. We
will drop the time subscript for the explanation of the wavelet decomposition of variance. As
intraday returns will be decomposed, we define
$X_i = p_{t-h+\frac{i}{n}h} - p_{t-h+\frac{i-1}{n}h}$ for all $i = 1, \dots, n-1$. The next sections
discuss a special form of discrete wavelet transformation, so from this point on we restrict the scale
$j$ and the translation $k$ parameters to integers only.

4.1. The maximal overlap discrete wavelet transform

The maximal overlap discrete wavelet transformation (MODWT) is a translation-invariant
type of discrete wavelet transformation, i.e., it is not sensitive to the choice of starting point of the
examined process. Furthermore, the MODWT does not use a downsampling procedure as in the
case of the discrete wavelet transform¹ (DWT), so the wavelet and scaling coefficient vectors at all
levels (scales) have equal length. As a consequence, the MODWT is not restricted to sample sizes
that are powers of two. This feature is very important for the analysis of real market data, since
this limitation is usually too restrictive.

Both the DWT and the MODWT wavelet and scaling coefficients can be used for energy 
decomposition and analysis of variance. In the literature the MODWT is also called the stationary 
wavelet transform, the translation invariant transform and the undecimated wavelet transform. 
For more details about the MODWT see Mallat (1998), Percival and Walden (2000) and Gençay
et al. (2002).

The MODWT is a very convenient tool for variance and energy analysis of a time series in the 
time-frequency domain. Percival (1995) demonstrates the advantages of the MODWT estimator 
of variance over the DWT estimator. Serroukh et al. (2000) analyze the statistical properties of 
the MODWT variance estimator for non-stationary and non-Gaussian processes. 

4.2. Definition of MODWT filters 

First, let us introduce the notion of MODWT scaling and wavelet filters on the $j$-th level,
$\tilde{g}_{j,l}$ and $\tilde{h}_{j,l}$, as rescaled scaling and wavelet filters used in a simple DWT,
i.e., $\tilde{g}_{j,l} = g_{j,l}/2^{j/2}$ and $\tilde{h}_{j,l} = h_{j,l}/2^{j/2}$. Both the scaling and
wavelet filters have the same width, as is the case for the DWT filters as well:

$$ L_j = (2^j - 1)(L - 1) + 1, \qquad (16) $$

where $L$ denotes the length of a basic wavelet filter. For example, the Daubechies D(4) wavelet
filter has length $L = 4$. There are three basic properties that both MODWT filters must fulfill. Let
us show these properties on the first level, $j = 1$, for the MODWT wavelet filter:

$$ \sum_{l=0}^{L-1} \tilde{h}_l = 0, \qquad \sum_{l=0}^{L-1} \tilde{h}_l^2 = 1/2, \qquad \sum_{l=-\infty}^{\infty} \tilde{h}_l \tilde{h}_{l+2n} = 0, \quad n \in \mathbb{Z}_N, \qquad (17) $$

and for the MODWT scaling filter:

$$ \sum_{l=0}^{L-1} \tilde{g}_l = 1, \qquad \sum_{l=0}^{L-1} \tilde{g}_l^2 = 1/2, \qquad \sum_{l=-\infty}^{\infty} \tilde{g}_l \tilde{g}_{l+2n} = 0, \quad n \in \mathbb{Z}_N. \qquad (18) $$
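
The three filter properties in Eqs. 17 and 18 can be verified numerically for the first-level D(4) MODWT filters. The sketch below assumes the standard D(4) scaling coefficients and the usual quadrature mirror relation between the scaling and wavelet filters; the variable and function names are illustrative. The orthogonality check uses the only nonzero even shift that fits within a filter of length four.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)

# Standard D(4) DWT scaling filter (unit energy); assumed, not quoted from the paper.
g_dwt = [(1 + SQRT3) / (4 * SQRT2), (3 + SQRT3) / (4 * SQRT2),
         (3 - SQRT3) / (4 * SQRT2), (1 - SQRT3) / (4 * SQRT2)]
L = len(g_dwt)

# Quadrature mirror relation: h_l = (-1)^l * g_{L-1-l}.
h_dwt = [(-1) ** l * g_dwt[L - 1 - l] for l in range(L)]

# First-level MODWT filters are the DWT filters divided by sqrt(2).
h = [c / SQRT2 for c in h_dwt]
g = [c / SQRT2 for c in g_dwt]


def shifted_product(f, shift):
    """sum_l f_l * f_{l+shift}, treating the filter as zero outside 0..L-1."""
    return sum(f[l] * f[l + shift] for l in range(len(f)) if 0 <= l + shift < len(f))


checks = {
    "sum h": sum(h),                          # should be 0
    "sum h^2": sum(c * c for c in h),         # should be 1/2
    "h even shift": shifted_product(h, 2),    # should be 0
    "sum g": sum(g),                          # should be 1
    "sum g^2": sum(c * c for c in g),         # should be 1/2
    "g even shift": shifted_product(g, 2),    # should be 0
}
for name, value in checks.items():
    print(f"{name:>13s} = {value:+.12f}")
```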

Let us consider a time series $(X_i)_{i\in[0,N-1]}$ of intraday returns. We obtain the MODWT wavelet
and scaling coefficients for $i = 0, \dots, N-1$ via circular filtering with the $j$-th level MODWT
wavelet and scaling filters:

$$ W_{j,i} = \sum_{l=0}^{L_j-1} \tilde{h}_{j,l} X_{i-l \bmod N}, \qquad (19) $$

$$ V_{j,i} = \sum_{l=0}^{L_j-1} \tilde{g}_{j,l} X_{i-l \bmod N}. \qquad (20) $$



¹ For a definition and detailed discussion of the discrete wavelet transform see Mallat (1998), Percival and Walden
(2000) and Gençay et al. (2002).


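With periodization of the MODWT wavelet and scaling filters to length $N$ we can write (Percival
and Walden, 2000):

$$ W_{j,i} = \sum_{l=0}^{N-1} \tilde{h}^{\circ}_{j,l} X_{i-l \bmod N}, \qquad (21) $$

$$ V_{j,i} = \sum_{l=0}^{N-1} \tilde{g}^{\circ}_{j,l} X_{i-l \bmod N}. \qquad (22) $$

The transfer function of a DWT filter $\{h_l\}$ at frequency $f$ is defined via the Fourier transform as:

$$ H(f) = \sum_{l=-\infty}^{\infty} h_l e^{-i 2\pi f l} = \sum_{l=0}^{L-1} h_l e^{-i 2\pi f l}, \qquad (23) $$

with the squared gain function

$$ \mathcal{H}(f) \equiv |H(f)|^2. \qquad (24) $$

The transfer functions of the $j$-th level MODWT wavelet and scaling filter are given as follows:

$$ \tilde{H}_j(f) = \tilde{H}_1(2^{j-1} f) \prod_{l=0}^{j-2} \tilde{G}_1(2^l f), \qquad (25) $$

$$ \tilde{G}_j(f) = \prod_{l=0}^{j-1} \tilde{G}_1(2^l f). \qquad (26) $$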
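
The transfer function and squared gain in Eqs. 23 and 24 are straightforward to evaluate. The sketch below is an illustration only: it assumes the standard D(4) DWT coefficients and hypothetical helper names, and tabulates the squared gains of the wavelet (high-pass) and scaling (low-pass) filters on a grid of frequencies. For an orthonormal filter pair the two squared gains sum to 2 at every frequency.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)

# Standard D(4) DWT scaling filter and its quadrature mirror wavelet filter (assumed).
g = [(1 + SQRT3) / (4 * SQRT2), (3 + SQRT3) / (4 * SQRT2),
     (3 - SQRT3) / (4 * SQRT2), (1 - SQRT3) / (4 * SQRT2)]
h = [(-1) ** l * g[len(g) - 1 - l] for l in range(len(g))]


def transfer(filt, f):
    """Transfer function sum_l filt_l * exp(-i 2 pi f l)."""
    return sum(c * cmath.exp(-2j * math.pi * f * l) for l, c in enumerate(filt))


def squared_gain(filt, f):
    """Squared gain, i.e. the squared modulus of the transfer function."""
    return abs(transfer(filt, f)) ** 2


freqs = [k / 16 for k in range(9)]  # 0, 1/16, ..., 1/2
gains = [(f, squared_gain(h, f), squared_gain(g, f)) for f in freqs]
for f, gain_h, gain_g in gains:
    print(f"f = {f:.4f}  wavelet = {gain_h:.4f}  scaling = {gain_g:.4f}")
```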

4.3. Energy decomposition of a stochastic process

For our analysis, it is important to show that we are able to decompose the energy of a stochastic
process on a scale-by-scale basis, i.e., we can get the energy contribution of every level $j$, with the
maximum level of decomposition $J^m \le \log_2 N$.

Proposition 4. Energy decomposition in discrete time
The energy of the time series $X_i$, $i = 1, \dots, N-1$ can be decomposed on a scale-by-scale basis
$J^m \le \log_2 N$ so that

$$ \|X\|^2 = \sum_{j=1}^{J^m} \|W_j\|^2 + \|V_{J^m}\|^2, \qquad (27) $$

where $\|X\|^2 = \sum_i X_i^2$, $\|W_j\|^2 = \sum_i W_{j,i}^2$, $\|V_{J^m}\|^2 = \sum_i V_{J^m,i}^2$, and
$W_j$ and $V_j$ are $N$-dimensional vectors of the $j$-th level MODWT wavelet and scaling coefficients.

The proof of the energy decomposition (Eq. 27) using the MODWT can be found in Appendix A.5.

It is worth noting that the squared norm $\|\cdot\|^2$ is similar to the realized measure discussed
in the preceding sections. For example, in the case of the realized variance estimator (RV) the energy
decomposition can reveal the contributions of particular scales to the overall energy, hence we can
see what form this realized measure takes.
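
The scale-by-scale energy decomposition can be checked numerically. The following sketch computes the MODWT with the pyramid algorithm, which applies the first-level filters with increasing strides instead of constructing the level-$j$ filters explicitly; this is a swapped-in but equivalent route to the circular filtering described above. The D(4) coefficients, the simulated white-noise returns and all names are assumptions made for illustration.

```python
import math
import random

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)
G_DWT = [(1 + SQRT3) / (4 * SQRT2), (3 + SQRT3) / (4 * SQRT2),
         (3 - SQRT3) / (4 * SQRT2), (1 - SQRT3) / (4 * SQRT2)]
H_DWT = [(-1) ** l * G_DWT[3 - l] for l in range(4)]
G = [c / SQRT2 for c in G_DWT]  # first-level MODWT scaling filter
H = [c / SQRT2 for c in H_DWT]  # first-level MODWT wavelet filter


def modwt(x, levels):
    """Return ([W_1, ..., W_J], V_J) computed by circular filtering."""
    n = len(x)
    v = list(x)
    wavelet_coeffs = []
    for j in range(1, levels + 1):
        stride = 2 ** (j - 1)  # filter taps are spaced 2^(j-1) apart at level j
        w_next = [sum(H[l] * v[(t - stride * l) % n] for l in range(4)) for t in range(n)]
        v_next = [sum(G[l] * v[(t - stride * l) % n] for l in range(4)) for t in range(n)]
        wavelet_coeffs.append(w_next)
        v = v_next
    return wavelet_coeffs, v


def energy(vec):
    """Squared Euclidean norm of a coefficient vector."""
    return sum(c * c for c in vec)


rng = random.Random(1)
x = [rng.gauss(0.0, 1e-3) for _ in range(300)]  # N = 300 is not a power of two
W, V = modwt(x, levels=4)

total = energy(x)
by_scale = [energy(w) for w in W]
reconstructed = sum(by_scale) + energy(V)
shares = [e / total for e in by_scale]
print(f"||X||^2 = {total:.6e}, sum over scales = {reconstructed:.6e}")
print("share of energy per wavelet scale:", [round(s, 3) for s in shares])
```

Because `x` plays the role of intraday returns, `total` is a realized variance and `by_scale` shows how much of it each scale contributes.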
For simplicity in notation let us define a vector $\mathbf{W}$ that consists of $J^m + 1$
$N$-dimensional subvectors, where the first $J^m$ subvectors are the MODWT wavelet coefficients at
levels $j = 1, \dots, J^m$ and the last subvector consists of the MODWT scaling coefficients at level
$J^m$:

$$ \mathbf{W} = (W_1, W_2, \dots, W_{J^m}, V_{J^m})', \qquad (28) $$

i.e., for Equation 27 the following holds:

$$ \|X\|^2 = \sum_{j=1}^{J^m} \|W_j\|^2 + \|V_{J^m}\|^2 = \sum_{j=1}^{J^m+1} \|\mathbf{W}_j\|^2. \qquad (29) $$



4.4. Wavelet variance

For a real-valued covariance stationary stochastic process $X_i$ with mean zero, the sequence of
the MODWT wavelet coefficients $W_{(X)j,i}$, for all $j, i > 0$ unaffected by the boundary conditions,
obtained by the wavelet decomposition at scale $j$ is also a stationary process with mean zero. The
variance of the wavelet coefficients at scale $j$ is the wavelet variance, i.e.,

$$ \nu^2_{(X)j} = \mathrm{var}(W_{(X)j,i}). \qquad (30) $$

While the variance of a covariance stationary process $X_i$ is equal to the integral of the spectral
density function $S_{(X)}(\cdot)$, the wavelet variance at a particular level $j$ is the variance of the
wavelet coefficients $W_{(X)j,i}$ with spectral density function $S_{(X)j}(\cdot)$:

$$ \nu^2_{(X)j} = \int_{-1/2}^{1/2} S_{(X)j}(f) \, df = \int_{-1/2}^{1/2} \tilde{\mathcal{H}}_j(f) S_{(X)}(f) \, df, \qquad (31) $$

where $\tilde{\mathcal{H}}_j(f)$ is the squared gain function of the wavelet filter $\tilde{h}_j$
(Percival and Walden, 2000). Since the variance of a process $X_i$ is the sum of the contributions of
the variances at all scales we can write:

$$ \mathrm{var}(X_i) = \sum_{j=1}^{\infty} \nu^2_{(X)j}. \qquad (32) $$

However, for a finite number of scales we have:

$$ \mathrm{var}(X_i) = \int_{-1/2}^{1/2} S_{(X)}(f) \, df = \sum_{j=1}^{J^m} \nu^2_{(X)j} + \mathrm{var}(V_{(X)J^m,i}). \qquad (33) $$



4.5. Estimation of the wavelet variance

Following Percival and Walden (2000) we define the unbiased MODWT wavelet variance
estimator for a covariance stationary Gaussian process $X_i$ (or a process that is covariance
stationary after $d$-th order backward differencing), $L \ge 2d$, at level $j$ as:

$$ \hat{\nu}^2_{(X)j} = \frac{1}{M_j} \sum_{i=L_j-1}^{N-1} W^2_{(X)j,i}, \qquad (34) $$

where $M_j = N - L_j + 1 > 0$ is the number of $j$-th level MODWT coefficients unaffected by
the boundary conditions. If we take all $N$ MODWT wavelet coefficients we obtain a biased
MODWT variance estimator. However, as $N \to \infty$ the ratio $M_j/N$ goes to unity, so
the two estimators give more or less the same results.
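
As an illustration of the estimator, the sketch below computes the first-level wavelet variance of simulated unit-variance white noise, once excluding the boundary coefficients (the unbiased version) and once using all coefficients. The D(4) coefficients, the white-noise assumption and the function names are hypothetical choices for this example; for white noise the first-level wavelet variance equals the process variance times the filter energy, i.e. 1/2.

```python
import math
import random

SQRT3 = math.sqrt(3.0)
# First-level D(4) MODWT wavelet filter (standard DWT filter divided by sqrt(2)).
H1 = [(1 - SQRT3) / 8, (-3 + SQRT3) / 8, (3 + SQRT3) / 8, (-1 - SQRT3) / 8]


def modwt_level1(x):
    """Circularly filter x with the first-level MODWT wavelet filter."""
    n = len(x)
    return [sum(H1[l] * x[(i - l) % n] for l in range(len(H1))) for i in range(n)]


def wavelet_variance(w, filter_width, unbiased=True):
    """Mean squared coefficient; the unbiased version skips boundary coefficients."""
    start = filter_width - 1 if unbiased else 0
    kept = w[start:]  # M_j = N - L_j + 1 coefficients remain when unbiased
    return sum(c * c for c in kept) / len(kept)


rng = random.Random(2)
N = 4096
x = [rng.gauss(0.0, 1.0) for _ in range(N)]  # unit-variance white noise
w1 = modwt_level1(x)

nu2_unbiased = wavelet_variance(w1, filter_width=4, unbiased=True)
nu2_biased = wavelet_variance(w1, filter_width=4, unbiased=False)
print(f"unbiased: {nu2_unbiased:.4f}, biased: {nu2_biased:.4f}, theory: 0.5")
```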

The estimator $\hat{\nu}^2_{(X)j}$ is a random variable, so it is of interest to know how close it is
to the real value of the wavelet variance $\nu^2_{(X)j}$. We assume that the sequences of wavelet
coefficients $W_{(X)j,i}$ are normally distributed random variables with zero mean and spectral
density function $S_{(X)j}(f)$. Following the result of Percival (1995), if $S_{(X)j}(f) > 0$ almost
everywhere and if

$$ \int_{-1/2}^{1/2} S^2_{(X)j}(f) \, df < \infty, \qquad (35) $$

then the MODWT variance estimator $\hat{\nu}^2_{(X)j}$ is an unbiased and asymptotically normally
distributed estimator with large sample variance $2 \int_{-1/2}^{1/2} S^2_{(X)j}(f) \, df / M_j$; see
also Percival and Walden (2000) and Serroukh et al. (2000). Thus, for large enough $M_j$, we can write

$$ \frac{M_j^{1/2} \left( \hat{\nu}^2_{(X)j} - \nu^2_{(X)j} \right)}{\left( 2 \int_{-1/2}^{1/2} S^2_{(X)j}(f) \, df \right)^{1/2}} \sim N(0, 1). \qquad (36) $$

Note also that Serroukh et al. (2000) derived the asymptotic distribution of the MODWT
wavelet variance estimator for other classes of processes, not only Gaussian and linear ones.

An interesting comparison of the DWT and MODWT wavelet variance estimators is discussed
in Percival (1995). Asymptotic relative efficiency is used to compare the DWT with the MODWT
wavelet variance estimator at the first level ($j = 1$). Further, it is assumed that the first-level
wavelet coefficient sequence $W_{(X)1,i}$ has a square summable spectral density function
$S_{(X)1}(f)$, for which $S_{(X)1}(f) > 0$ holds almost everywhere. The DWT and MODWT variance
estimators at the first level, denoted as $\hat{\nu}^2_{(D)1}$ and $\hat{\nu}^2_{(M)1}$, are then
asymptotically normally distributed with mean $\nu^2_{(X)1}$. The asymptotic relative efficiency of
the DWT estimator with respect to the MODWT estimator is defined as follows:

$$ e\left( \hat{\nu}^2_{(D)1}, \hat{\nu}^2_{(M)1} \right) = \lim_{N \to \infty} \frac{\mathrm{var}\left( \hat{\nu}^2_{(M)1} \right)}{\mathrm{var}\left( \hat{\nu}^2_{(D)1} \right)} = \frac{\int_{-1/2}^{1/2} S^2_{(X)1}(f) \, df}{\int_{-1/2}^{1/2} \left[ S^2_{(X)1}(f) + S_{(X)1}(f) S_{(X)1}\left( f + \tfrac{1}{2} \right) \right] df}. \qquad (37) $$

Identity 37 clearly shows that the DWT variance estimator cannot be more efficient than the
MODWT estimator, since any spectral density function is nonnegative. In some
cases, we can obtain a significant reduction in the large sample variance by using the MODWT
estimator. Using large Monte Carlo simulations, Percival (1995) shows, for example, that for a
white noise process the asymptotic relative efficiency of the DWT variance estimator with respect
to the MODWT variance estimator using the Daubechies D(4) is 0.82, i.e., the variance of the
MODWT-based estimator is significantly lower.
4.6. Wavelet-based realized variance estimator

We have introduced all the theory needed to return to the estimation of the realized variance,
so we can proceed to defining our wavelet-based estimators in this section.

Definition 2. Wavelet-based realized variance
The wavelet-based realized variance over $[t-h,t]$, for $0 \le h \le t \le T$, is defined by

$$ \widehat{RV}^{(WRV)}_{t,h} = \sum_{j=1}^{J^m+1} \sum_{k=1}^{n} W^2_{j,\,t-h+\frac{k}{n}h}, \qquad (38) $$

where $n$ is the number of intraday observations in $[t-h,t]$ and $J^m$ is the number of scales we
consider. $W_{j,\,t-h+\frac{k}{n}h}$ are the MODWT coefficients defined in Eq. 29 on returns data
$r_{t,h}$ on scales $j = 1, \dots, J^m + 1$, where $J^m \le \log_2 n$.

Proposition 5. Wavelet-based realized variance as an unbiased variance estimator
If the return process is square-integrable and $\mu_t \equiv 0$, then for any value of $n \ge 1$,

$$ E\left[ RV_{t,h} \mid \mathcal{F}_t \right] = E\left[ \widehat{RV}^{(WRV)}_{t,h} \mid \mathcal{F}_t \right]. \qquad (39) $$

Proposition 6. Consistency of wavelet-based realized variance
The wavelet realized variance provides a consistent nonparametric measure of the variance,

$$ \widehat{RV}^{(WRV)}_{t,h} \xrightarrow{\;p\;} RV_{t,h}, \qquad 0 \le h \le t \le T, \qquad (40) $$

where the convergence is uniform in probability.

The proofs of Proposition 5 and Proposition 6 are provided in Appendix A.6.
Consequently, the normal mixture hypothesis holds:

$$ r_{t,h} \mid \sigma\left\{ \mu_{t,h}, \widehat{RV}^{(WRV)}_{t,h} \right\} \sim N\left( \mu_{t,h}, \widehat{RV}^{(WRV)}_{t,h} \right). \qquad (41) $$

The wavelet realized variance estimator decomposes the realized variance. Thus it is an unbiased
estimator and, with increasing sampling frequency $n \to \infty$, also a consistent estimator of the
quadratic variation. Still, under the assumption of the presence of noise (see Proposition 3) in the
data, both the wavelet realized volatility and the realized volatility are biased. Therefore, in the
following section we will introduce the concept of treating jumps using wavelets. This will be the
last step in proposing the final estimator, which will be robust to jumps as well as microstructure
noise.



4.7. Realized jump estimation using wavelets 

As discussed earlier, the presence of jumps in the process generating stock prices is needed to
describe the real-world data well. This is commonly done by considering the price process
$(p_t)_{t\in[0,T]}$ from Proposition 3. But then the realized variance estimator, although being able
to deal with the noise, will still contain the jump variation. Thus it is natural to separate the two
components of the quadratic variation:

$$ QV_{t,h} = \underbrace{\int_{t-h}^{t} \sigma_s^2 \, ds}_{\text{Integrated Variance}} + \underbrace{\sum_{t-h \le s \le t} J_s^2}_{\text{Jump Variation}}. \qquad (42) $$

Once we have the jumps separated from the process, we are able to determine
$\langle p,p \rangle_t$.

Here, we utilize wavelets again as they can also be used for estimating jumps and separating
integrated variance from jump variation. The sample path of $p_t$ has a finite number of jumps
(a.s.). Fan and Wang (2007) apply wavelet jump detection to the deterministic functions with
i.i.d. additive noise $\epsilon_t$ of Wang (1995) and Raimondo (1998). Following these references, we
can make the following proposition.

Proposition 7. Suppose that the sample path of $p_t$ has $q = N_t < \infty$ jumps at
$\tau_1, \dots, \tau_q$. Then

$$ \lim_{n \to \infty} P\left( \hat{q} = q, \; \sum_{l=1}^{q} |\hat{\tau}_l - \tau_l| \le n^{-1} \log^2 n \right) = 1, \qquad (43) $$

where $\hat{q} = \hat{N}_t$ is the number of estimated jumps with locations
$\hat{\tau}_1, \dots, \hat{\tau}_{\hat{q}}$.

For precise jump detection we again use the MODWT wavelet coefficients. With the MODWT
we are not restricted to a dyadic sample length. Unlike the ordinary DWT, the MODWT is
not subsampled, so each wavelet coefficient is associated with a time position at every scale.
Moreover, the MODWT uses zero-phase filters. Hence, the MODWT determines the exact
location of a jump more accurately.

For the estimation of the jump location we use the universal threshold (Donoho and Johnstone,
1994) on the first-level wavelet coefficients of $y_t$ over $[t-h,t]$, $W_{1,k}$. If for some $W_{1,k}$

$$|W_{1,k}| > d\sqrt{2\log n}, \qquad (44)$$

then $\hat\tau_l = \{k\}$ is the estimated jump location with size $\bar y_{\hat\tau_l+} - \bar y_{\hat\tau_l-}$ (the averages of $y_t$ over
$[\hat\tau_l,\hat\tau_l+\delta_n]$ and $[\hat\tau_l-\delta_n,\hat\tau_l]$, respectively, with $\delta_n > 0$ being a small neighborhood of the
estimated jump location), and where $d$ is the median absolute deviation estimator defined as
$\mathrm{median}\{|W_{1,k}|,\ k = 1,\ldots,n\}/0.6745$; for more details see Percival and Walden (2000).

The jump variation is then estimated by the sum of the squares of all the estimated jump sizes:

$$WJV = \sum_{l=1}^{\hat N_t}\left(\bar y_{\hat\tau_l+} - \bar y_{\hat\tau_l-}\right)^2. \qquad (45)$$

Fan and Wang (2007) proved that we are able to estimate the jump variation from the process
consistently.
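The detection rule of Eqs. 44 and 45 can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it uses the Haar filter for the level-1 MODWT (the text does not fix the wavelet here), a circular boundary, and a hypothetical window length `delta` (in observations) standing in for $\delta_n$. The function names are invented for this example.

```python
import numpy as np

def haar_modwt_level1(y):
    """Level-1 Haar MODWT wavelet coefficients, W_{1,k} = (y_k - y_{k-1}) / 2.

    Assumption: Haar filter with a circular boundary (np.roll wraps around).
    """
    y = np.asarray(y, dtype=float)
    return 0.5 * (y - np.roll(y, 1))

def detect_jumps(y, delta=5):
    """Flag jumps with the universal threshold and estimate the jump variation.

    `delta` is a hypothetical neighborhood length (number of observations).
    Returns (locations, sizes, wjv).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    w = haar_modwt_level1(y)
    w[0] = 0.0  # discard the wrap-around coefficient created by the circular boundary

    d = np.median(np.abs(w)) / 0.6745           # MAD estimate of the coefficient scale
    threshold = d * np.sqrt(2.0 * np.log(n))    # universal threshold, Eq. 44
    locations = np.flatnonzero(np.abs(w) > threshold)

    sizes = []
    for k in locations:
        right = y[k:min(n, k + delta)].mean()   # average just after the jump
        left = y[max(0, k - delta):k].mean()    # average just before the jump
        sizes.append(right - left)
    sizes = np.array(sizes, dtype=float)

    wjv = float(np.sum(sizes ** 2))             # sum of squared jump sizes, Eq. 45
    return locations, sizes, wjv
```

With a simulated log-price that contains one jump of size 0.02, `detect_jumps` should flag the jump index and return a `wjv` close to 0.0004. Occasional false detections from the threshold contribute only squared increments of ordinary size.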

Proposition 8. Consistency of the wavelet jump estimator
With $n \to \infty$,

$$\operatorname{plim}_{n\to\infty} WJV = \sum_{l=1}^{N_t} J_l^2, \qquad (46)$$

with the convergence rate $N^{-1/4}$.
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In the following analysis, we will be able to separate the continuous part of the price process 
containing noise from the jump variation. This result can be found in Fan and Wang (2007), and it
states that the jump-adjusted process $y_{t,h}^{(J)} = y_{t,h} - \sum_{l=1}^{N_t} J_l$ converges in probability to the continuous
part without jumps; thus its variation is exactly the first part of Eq. 42, the integrated variance.
Thus, if we are able to deal with the noise in $y_{t,h}^{(J)}$, we will be able to estimate the true $\langle p,p\rangle_t$.

4.8. Jump wavelet two-scale realized variance estimator

Finally, let us propose an estimator of realized variance that is able to estimate jumps from the
process consistently. With $n \to \infty$, it is able to recover the true integrated variance from noisy
data. Moreover, we can use it to decompose the integrated variance into $J^m$ components. In the
final estimator, we utilize what we already know: the TSRV estimator of Zhang et al. (2005), the
wavelet representation theorem from Proposition 2 and the jump detection method proposed in the
previous section.

Definition 3. Jump wavelet TSRV (JWTSRV) estimator 

' (estimator, J) 

Let RV^fi denote an estimator of realized variance over [t — h,t\, for < h < t < T, 

on the jump-adjusted observed data, y'^'^f^ = yt^h ~ '^l- jump-adjusted wavelet two-scale 

realized variance estimator is defined as: 

where RV^ f^ ' ^ = ^Yl^=iJ2'j=t^J2k=i'^'^t ^ fc/j obtained from wavelet coefficient estimates 
on a grid of size n = n/G on the jump-adjusted observed data, y^'^ = yt,h — Jl- 
Proposition 9. JWTSRV unbiased variance estimator 

If the return process is square-integrable and $\mu = 0$, then for any value of $n > 1$,

$$E\left[RV_{t,h}\mid\mathcal{F}_t\right] = E\left[\widehat{RV}_{t,h}^{(JWTSRV)}\mid\mathcal{F}_t\right]. \qquad (48)$$



Proposition 10. Consistency of JWTSRV
The wavelet realized variance provides a consistent nonparametric measure of the variance,

$$\operatorname{plim}_{n\to\infty}\widehat{RV}_{t,h}^{(JWTSRV)} = RV_{t,h},\qquad 0\le h\le t\le T, \qquad (49)$$

where the convergence is uniform in probability.

The proofs of Proposition 9 and Proposition 10 are provided in Appendix A.7. The JWTSRV
estimator decomposes the realized variance into an arbitrarily chosen number of investment horizons
and jumps. Thus it is an unbiased estimator and, with increasing sampling frequency $n \to \infty$, it is
also a consistent estimator of the integrated variance, as it converges in probability to the true
integrated variance $\langle p,p\rangle_t$ of the process $p_t$.

Thus we have defined a wavelet-based realized variation theory which is able to estimate realized 
variance consistently in the presence of noise and jumps. In the next sections, we will test the small 
sample performance of the estimators and perform an empirical study on real-world data. 
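To make the two-scale idea concrete, the following sketch implements the plain TSRV of Zhang et al. (2005) on observed log-prices, including the small-sample factor $(1-\bar n/n)^{-1}$. It is not the JWTSRV itself: the wavelet decomposition and the jump adjustment are left out. The function names are hypothetical, and the average subgrid size is taken as $\bar n = (n-G+1)/G$, following Zhang et al. (2005).

```python
import numpy as np

def realized_variance(y):
    """Sum of squared increments of the log-price series y."""
    return float(np.sum(np.diff(y) ** 2))

def tsrv(y, G, adjust=True):
    """Two-scale realized variance on G subgrids (sketch, no wavelets, no jump adjustment).

    y      : observed log-prices (n + 1 points, possibly contaminated by noise)
    G      : number of subgrids; subgrid g uses observations g, g + G, g + 2G, ...
    adjust : apply the small-sample factor (1 - n_bar / n)^(-1)
    """
    y = np.asarray(y, dtype=float)
    n = len(y) - 1                        # number of returns on the full grid
    rv_all = realized_variance(y)         # dominated by noise at high frequency
    rv_avg = np.mean([realized_variance(y[g::G]) for g in range(G)])
    n_bar = (n - G + 1) / G               # average number of returns per subgrid
    estimate = rv_avg - (n_bar / n) * rv_all   # remove the noise-induced bias
    if adjust:
        estimate /= (1.0 - n_bar / n)
    return estimate
```

On a simulated day with 23,400 one-second observations and noise of standard deviation 0.0005, the full-grid realized variance is dominated by the noise term, while `tsrv(y, G=300)` (5-minute subgrids) should stay close to the integrated variance.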

In small samples, a small sample refinement can be constructed (Zhang et al., 2005):

$$\widehat{RV}_{t,h}^{(JWTSRV,adj)} = \left(1-\frac{\bar n}{n}\right)^{-1}\widehat{RV}_{t,h}^{(JWTSRV)}. \qquad (50)$$

When referring to the realized volatility estimated using our JWTSRV estimator, we will refer to



5. Numerical study of the small sample performance of the estimators 

In this section, we study the small sample performance of the estimators using Monte Carlo
simulations designed to capture the real nature of the data. We run several experiments using
different volatility models, including a fractional stochastic volatility model capturing long memory
in volatility. The main purpose of the study is to show that the proposed methodology is robust
to noise and jumps under various settings. We also add a numerical study of the forecasting
performance of the estimators. Each experiment compares the performance of the realized variation
estimator, the bipower variation estimator, the two-scale realized volatility, the realized kernel,
and the jump wavelet two-scale realized variation defined by Eq. 47. All the estimators are
adjusted for small sample bias, similarly to Eq. 50. For convenience, we refer to the estimators in
the description of the results as RV, BV, TSRV, RK and JWTSRV, respectively. Moreover, we
also compare the minimum variance estimators TSRV* and JWTSRV*, which minimize the total
asymptotic variance of the estimators (Zhang et al., 2005).

5.1. Jump-diffusion model with stochastic volatility

The first data generating model we assume in our study is a one-factor jump-diffusion model
with stochastic volatility, described by the following equations:

$$\begin{aligned} dX_t &= (\mu - \sigma_t^2/2)\,dt + \sigma_t\,dW_{x,t} + c_t\,dN_t \\ d\sigma_t^2 &= \kappa(\alpha - \sigma_t^2)\,dt + \gamma\sigma_t\,dW_{y,t}, \end{aligned} \qquad (51)$$

where $W_x$ and $W_y$ are standard Brownian motions with correlation $\rho$, and $c_t\,dN_t$ is a compound
Poisson process with random jump size distributed as $N(0,\sigma_J^2)$. We set the parameters to values
which are reasonable for a stock price, as in Zhang et al. (2005), who used model 51 without jumps:
$\mu = 0.05$, $\alpha = 0.04$, $\kappa = 5$, $\gamma = 0.5$, $\rho = -0.5$ and $\sigma_J = 0.025$. The volatility parameters satisfy
Feller's condition $2\kappa\alpha \ge \gamma^2$, which keeps the volatility process away from the zero boundary. We
generate 10,000 independent sample paths¹ of the process using the Euler scheme at a time interval
of $\delta = 1$ s, each with $6.5 \times 60 \times 60$ steps ($n = 23{,}400$), corresponding to a 6.5-hour trading day. On
each simulated path, we estimate $\langle p,p\rangle_t$ over $t = 1$ day, as the parameter values are annualized
(i.e., $t = 1/252$). The results are computed for sampling of 5 minutes ($M = 78$) for RV, BV, TSRV,
RK and JWTSRV, as well as for the optimal sampling frequency found by minimizing the total
asymptotic variance of the estimators for TSRV* and JWTSRV*.

We repeat the simulation with different levels of noise as well as different numbers of jumps.
We assume that the market microstructure noise, $\epsilon_t$, comes from a Gaussian distribution with
different standard deviations: $(E[\epsilon^2])^{1/2} = \{0, 0.0005, 0.001, 0.0015\}$. Thus, the first simulated
model, $(E[\epsilon^2])^{1/2} = 0$, has zero noise. The remaining three models have levels of microstructure
noise corresponding to 0.05%, 0.1% and 0.15% of the value of the asset price.

¹ We have also computed the results for smaller numbers of simulations, up to 1,000 generated independent sample
paths, and we found that the results do not change at all. These results are available upon request from the authors.

Table 1: Bias (variance in parentheses) $\times 10^4$ of all estimators from 10,000 simulations of the jump-diffusion model with
$\epsilon_1 = 0$, $\epsilon_2 = 0.0005$, $\epsilon_3 = 0.001$, $\epsilon_4 = 0.0015$. RV - 5 min. realized variance estimator, BV - 5 min. bipower variation
estimator, TSRV - 5 min. two-scale realized volatility, JWTSRV - 5 min. jump wavelet two-scale realized variance.
TSRV* and JWTSRV* are minimum variance estimators (see A.48), and RK is the realized kernel.





      RV                BV                TSRV              TSRV*             RK                JWTSRV            JWTSRV*

No Jumps
ε1    0.90 (0.65)       -4.13 (0.82)      -6.03 (0.43)      -0.28 (0.02)      -15.18 (2.51)     -6.08 (0.43)      -0.37 (0.02)
ε2    100.10 (0.93)     97.36 (1.18)      -5.25 (0.45)      0.98 (0.51)       -4.40 (2.63)      -3.86 (0.45)      2.29 (0.52)
ε3    394.14 (2.10)     412.43 (2.87)     -5.15 (0.45)      -1.31 (0.90)      19.66 (2.91)      0.19 (0.48)       3.96 (0.93)
ε4    885.81 (6.40)     949.39 (8.00)     -4.52 (0.43)      -0.47 (1.34)      52.94 (3.13)      7.71 (0.68)       11.93 (1.48)

One Jump
ε1    247.73 (19.31)    53.84 (1.85)      236.63 (18.64)    245.55 (18.09)    225.41 (23.19)    -5.64 (0.44)      -0.25 (0.02)
ε2    354.79 (20.91)    164.24 (2.77)     246.24 (19.67)    253.69 (19.61)    241.88 (23.10)    -0.35 (0.48)      4.36 (0.52)
ε3    648.69 (23.12)    495.58 (5.15)     241.06 (19.79)    251.24 (20.44)    260.10 (25.62)    18.12 (0.64)      23.94 (1.10)
ε4    1139.00 (27.54)   1044.80 (10.79)   248.00 (20.30)    256.50 (21.02)    303.39 (25.26)    68.29 (1.41)      64.39 (2.29)

Two Jumps
ε1    503.32 (41.12)    117.87 (3.84)     489.24 (39.47)    601.61 (38.99)    471.67 (47.36)    -6.27 (0.43)      -0.36 (0.02)
ε2    616.80 (41.99)    237.65 (4.56)     500.37 (39.61)    613.16 (39.69)    489.82 (45.66)    3.43 (0.49)       7.41 (0.54)
ε3    910.28 (44.71)    582.94 (7.67)     499.62 (39.83)    508.96 (39.62)    617.36 (48.27)    38.99 (0.81)      43.39 (1.25)
ε4    1398.40 (47.55)   1160.20 (15.04)   496.34 (39.15)    505.27 (38.93)    661.50 (47.76)    108.73 (2.34)     113.96 (3.06)

Three Jumps
ε1    772.53 (62.38)    191.00 (6.58)     753.28 (60.11)    766.80 (58.86)    730.70 (72.17)    -6.62 (0.46)      -0.37 (0.02)
ε2    858.07 (61.60)    312.01 (7.34)     741.10 (58.62)    759.90 (58.56)    720.73 (68.89)    6.04 (0.51)       10.21 (0.53)
ε3    1169.30 (68.71)   671.31 (10.71)    756.73 (61.89)    767.36 (60.72)    769.49 (74.86)    69.16 (0.96)      61.90 (1.37)
ε4    1650.50 (69.52)   1257.80 (18.55)   742.31 (68.93)    757.31 (59.37)    787.06 (71.48)    160.10 (3.19)     167.24 (3.94)



Moreover, we add different numbers of jumps, controlled by the intensity $\lambda$ of the Poisson process
$c_t\,dN_t$. We start with $\lambda = 0$, with model 51 reducing to a modification of the standard Heston
volatility model without jumps, and continue with jump intensities implying up to three jumps
per day in the process. This number is realistic according to findings in the literature. The size
of the jumps is controlled by the parameter $\sigma_J$, which is set to 0.025, implying that a one standard
deviation jump changes the price level by 2.5%. In total, we have 16 models with different levels of
noise and numbers of jumps, and we compare the bias of all the estimators for each simulated day.
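A single simulated day from model 51 can be sketched as below. This is a minimal illustration, not the authors' code, and it makes several assumptions: the number of jumps per day is fixed rather than drawn from a Poisson distribution, the variance is floored at zero inside the Euler step (full truncation), the variance starts at its long-run level $\alpha$, and the function name and return values are hypothetical.

```python
import numpy as np

def simulate_day(n=23400, mu=0.05, alpha=0.04, kappa=5.0, gamma=0.5, rho=-0.5,
                 sigma_j=0.025, n_jumps=1, noise_sd=0.0005, seed=0):
    """Euler scheme for one trading day of the jump-diffusion model (annualized parameters).

    Returns (y, x, iv, jumps): noisy log-prices, latent log-prices,
    the day's integrated variance, and the per-step jump sizes.
    """
    rng = np.random.default_rng(seed)
    dt = (1.0 / 252.0) / n                      # one day split into n steps

    # Correlated Gaussian shocks for price and variance.
    z1 = rng.standard_normal(n)
    z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)

    # Assumption: exactly n_jumps jumps at uniformly chosen steps, sizes N(0, sigma_j^2).
    jumps = np.zeros(n)
    idx = rng.choice(n, size=n_jumps, replace=False)
    jumps[idx] = rng.normal(0.0, sigma_j, n_jumps)

    x = np.empty(n + 1)
    v = np.empty(n + 1)
    x[0], v[0] = 0.0, alpha                     # start the variance at its long-run level
    for i in range(n):
        vp = max(v[i], 0.0)                     # floor the variance at zero in the step
        x[i + 1] = x[i] + (mu - vp / 2.0) * dt + np.sqrt(vp * dt) * z1[i] + jumps[i]
        v[i + 1] = v[i] + kappa * (alpha - vp) * dt + gamma * np.sqrt(vp * dt) * z2[i]

    iv = float(np.sum(np.maximum(v[:-1], 0.0)) * dt)   # Riemann sum of the variance path
    y = x + rng.normal(0.0, noise_sd, n + 1)           # additive Gaussian microstructure noise
    return y, x, iv, jumps
```

Calling `simulate_day(n_jumps=0, noise_sd=0.0)` gives the no-jump, no-noise case; the 16 settings of the study correspond to `n_jumps` in {0, 1, 2, 3} combined with `noise_sd` in {0, 0.0005, 0.001, 0.0015}.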

Table 1 shows the results. The first model, without jumps, corresponds to the findings of Zhang
et al. (2005) and Aït-Sahalia and Mancini (2008), although we add a higher level of noise to the
simulations as suggested by the literature. The results show how robust the TSRV-based and RK
estimators are to an increase in noise. Even a small increase in the magnitude of noise causes large
bias in the other estimators, whereas the bias of the TSRV-based and RK estimators remains small
by comparison (see the no-jump panel of Table 1). What we add to the original results of Zhang
et al. (2005) and Aït-Sahalia and Mancini (2008) are jumps. While TSRV and RK are robust to an
increase in noise, they are not robust to an increase in jumps at all. From the rest of the results,
we can see how the wavelets detect the jumps in the process and the JWTSRV stays close to
unbiased. From the results we can also see that with a mixture of relatively high noise and a large
number of jumps in the process even the JWTSRV estimator suffers from bias. This suggests that
jumps are sometimes indistinguishable from noise and remain undetected under large noise. We
can also see that the BV is able to deal with jumps to some extent, but is hurt heavily by noise.

5.2. Fractional stochastic volatility model 

Empirical evidence suggests that the volatility process may exhibit long memory. Previous
models approximate this behavior, but a much more powerful class of models designed to capture
long memory is known in the literature, namely, models driven by fractional Brownian motion.
Instead of describing the solution and method of simulation of this class of models here, we point
the interested reader to Comte and Renault (1999) and Marinucci and Robinson (1999) for more details.

Table 2: Bias (variance in parentheses) $\times 10^4$ of all estimators from 10,000 simulations of the fractional stochastic volatility
model with Hurst parameter $H = 0.5$, with $\epsilon_1 = 0$, $\epsilon_2 = 0.0005$, $\epsilon_3 = 0.001$, $\epsilon_4 = 0.0015$. RV - 5 min. realized
variance estimator, BV - 5 min. bipower variation estimator, TSRV - 5 min. two-scale realized volatility, JWTSRV
- 5 min. jump wavelet two-scale realized variance. TSRV* and JWTSRV* are minimum variance estimators (see
A.48), and RK is the realized kernel.





      RV                BV                TSRV              TSRV*             RK                JWTSRV            JWTSRV*

No Jumps
ε1    7.65 (10.51)      -19.57 (13.32)    -26.55 (6.86)     -1.16 (0.23)      -66.40 (39.16)    -26.80 (6.86)     -1.48 (0.23)
ε2    104.62 (11.14)    82.08 (14.41)     -26.79 (6.70)     -0.41 (0.86)      -59.74 (38.38)    -25.51 (6.71)     0.74 (0.86)
ε3    407.48 (15.07)    383.91 (19.41)    -23.47 (6.71)     -0.98 (1.45)      -20.79 (41.69)    -18.27 (6.74)     4.32 (1.48)
ε4    896.20 (22.26)    888.97 (29.63)    -25.28 (6.85)     -6.32 (2.23)      19.06 (44.66)     -13.76 (7.04)     6.14 (2.36)

One Jump
ε1    254.70 (32.31)    97.88 (18.00)     219.71 (27.19)    249.51 (19.81)    167.85 (67.92)    -27.29 (6.69)     -1.42 (0.25)
ε2    356.65 (32.73)    196.24 (19.36)    219.05 (26.04)    247.03 (18.80)    184.96 (67.27)    -20.04 (6.68)     4.05 (0.84)
ε3    654.63 (37.40)    507.79 (24.60)    222.84 (27.44)    249.36 (20.25)    213.83 (70.42)    1.88 (7.19)       24.29 (1.64)
ε4    1151.80 (45.69)   1026.90 (36.71)   226.66 (27.63)    251.67 (21.35)    266.50 (73.94)    39.10 (8.15)      60.25 (3.10)

Two Jumps
ε1    510.21 (53.31)    217.50 (22.70)    470.75 (47.16)    505.47 (38.33)    411.80 (97.74)    -25.56 (6.75)     -0.42 (0.26)
ε2    611.27 (57.48)    317.64 (24.09)    471.07 (49.70)    506.63 (40.45)    424.19 (101.42)   -20.62 (6.88)     5.74 (0.87)
ε3    914.79 (60.52)    636.31 (30.92)    476.78 (49.28)    505.09 (40.70)    466.31 (103.95)   21.00 (7.38)      42.32 (1.79)
ε4    1396.70 (67.41)   1155.20 (42.77)   474.58 (47.27)    504.16 (40.06)    506.18 (103.54)   93.30 (9.22)      117.05 (4.00)

Three Jumps
ε1    765.95 (78.40)    346.13 (28.96)    719.80 (69.95)    750.18 (57.99)    670.56 (134.88)   -23.75 (6.88)     -1.96 (0.26)
ε2    855.63 (76.82)    436.22 (29.67)    713.92 (66.84)    750.49 (58.47)    666.32 (127.53)   -15.91 (6.74)     9.47 (0.88)
ε3    1161.90 (81.72)   762.38 (37.14)    721.15 (68.21)    758.76 (58.42)    705.35 (134.87)   35.08 (7.65)      61.87 (1.96)
ε4    1662.10 (95.44)   1299.40 (52.60)   722.60 (69.09)    758.79 (59.70)    746.30 (136.93)   135.71 (10.19)    162.66 (4.72)



In our simulations, we use the fractional jump-diffusion model:

$$\begin{aligned} dX_t &= (\mu - \sigma_t^2/2)\,dt + \sigma_t\,dW_{x,t} + c_t\,dN_t \\ d\sigma_t^2 &= \kappa(\alpha - \sigma_t^2)\,dt + \gamma\,dW_{H,t}, \end{aligned} \qquad (52)$$

where $W_x$ is a standard Brownian motion, $W_{H,t}$ is a fractional Brownian motion (FBM) with
Hurst parameter $H \in (0,1]$ and $c_t\,dN_t$ is a compound Poisson process with random jump size
distributed as $N(0,\sigma_J^2)$. We set the parameters to the values $\mu = 0.05$, $\alpha = 0.2$, $\kappa = 20$, $\gamma = 0.012$
and $\sigma_J = 0.025$ as in Aït-Sahalia and Mancini (2008), although these authors use a process without
jumps.

We generate 10,000 independent sample paths² of the process using the Euler scheme at a time
interval of $\delta = 1$ s, each with $6.5 \times 60 \times 60$ steps ($n = 23{,}400$), corresponding to 6.5 trading hours.
The results are computed for sampling of 5 minutes ($M = 78$) for RV, BV, TSRV, RK and JWTSRV,
as well as for the optimal sampling frequency found by minimizing the total asymptotic variance
for TSRV* and JWTSRV*. We again repeat the simulation with different levels of noise as well
as different numbers of jumps. We assume that the market microstructure noise, $\epsilon_t$, comes from
a Gaussian distribution with different standard deviations: $(E[\epsilon^2])^{1/2} = \{0, 0.0005, 0.001, 0.0015\}$,
and we again start without jumps and continue with jump intensities implying up to three jumps
per day in the process. In total, we have 16 models with different levels of noise and numbers of
jumps, and we compare the bias of all the estimators for each simulated day on three processes
with different long memory parameters.

Increments of the volatility process with $H \in (0.5, 1]$ exhibit the desired long memory. Thus
we study this model for a Hurst exponent equal to $H = \{0.5, 0.7, 0.9\}$. While the first case has
independent increments, the second and third cases exhibit quite strong long memory in
volatility.
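The text does not describe how the fractional Brownian motion is simulated, so the following is only one standard possibility, shown for orientation: exact simulation of unit-step fractional Gaussian noise (the increments of FBM) by a Cholesky factorization of its autocovariance. The function names are hypothetical, and the sketch assumes $H < 1$, since for $H = 1$ the covariance matrix is singular and the factorization fails.

```python
import numpy as np

def fgn_autocovariance(n, hurst):
    """Autocovariance of unit-step fractional Gaussian noise at lags 0..n-1."""
    k = np.arange(n, dtype=float)
    two_h = 2.0 * hurst
    return 0.5 * (np.abs(k + 1) ** two_h - 2.0 * np.abs(k) ** two_h
                  + np.abs(k - 1) ** two_h)

def fgn_cholesky(n, hurst, rng):
    """Draw n increments of fractional Brownian motion (unit time step).

    Exact but O(n^3); suitable for short paths only. Assumes 0 < hurst < 1.
    """
    acov = fgn_autocovariance(n, hurst)
    lags = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    cov = acov[lags]                       # Toeplitz covariance matrix
    chol = np.linalg.cholesky(cov)         # lower-triangular factor
    return chol @ rng.standard_normal(n)
```

For a time step $\delta$, the increments are rescaled by $\delta^{H}$ (self-similarity) before entering the Euler step of the variance equation. For $H = 0.5$ the covariance matrix is the identity and the draw reduces to ordinary independent Gaussian increments. At the 23,400-step resolution of the study, a faster method would be needed.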



² We have also computed the results for smaller numbers of simulations, up to 1,000 generated independent sample
paths, and we found that the results do not change at all. These results are available upon request from the authors.

Table 3: Bias (variance in parentheses) $\times 10^4$ of all estimators from 10,000 simulations of the fractional stochastic volatility
model with Hurst parameter $H = 0.7$, with $\epsilon_1 = 0$, $\epsilon_2 = 0.0005$, $\epsilon_3 = 0.001$, $\epsilon_4 = 0.0015$. RV - 5 min. realized
variance estimator, BV - 5 min. bipower variation estimator, TSRV - 5 min. two-scale realized volatility, JWTSRV
- 5 min. jump wavelet two-scale realized variance. TSRV* and JWTSRV* are minimum variance estimators (see
A.48), and RK is the realized kernel.





      RV                BV                TSRV              TSRV*             RK                JWTSRV            JWTSRV*

No Jumps
ε1    9.47 (10.57)      -14.18 (13.44)    -25.81 (6.87)     -0.62 (0.24)      -61.83 (39.91)    -26.17 (6.86)     -0.94 (0.23)
ε2    106.09 (11.24)    78.59 (14.57)     -22.93 (6.73)     -0.29 (0.84)      -49.16 (39.28)    -21.66 (6.75)     0.86 (0.84)
ε3    404.06 (14.44)    380.66 (18.75)    -23.64 (6.79)     -1.01 (1.45)      -13.44 (43.10)    -17.93 (6.88)     4.50 (1.48)
ε4    899.67 (22.67)    895.53 (29.96)    -21.95 (6.89)     -1.66 (2.19)      32.94 (46.66)     -9.40 (7.12)      10.72 (2.33)

One Jump
ε1    260.24 (32.42)    99.77 (17.58)     226.07 (27.77)    252.00 (19.93)    175.24 (71.81)    -24.61 (6.75)     -0.66 (0.24)
ε2    361.23 (33.51)    204.48 (19.56)    222.55 (26.96)    250.42 (19.87)    194.31 (70.22)    -20.36 (6.68)     3.47 (0.85)
ε3    658.78 (36.67)    507.47 (24.81)    229.15 (26.77)    253.33 (20.29)    221.70 (71.31)    1.16 (7.28)       21.96 (1.62)
ε4    1140.50 (47.95)   1014.50 (37.09)   221.27 (28.05)    248.39 (22.10)    260.55 (74.86)    35.43 (8.07)      61.22 (3.19)

Two Jumps
ε1    514.66 (55.01)    219.27 (23.17)    473.71 (48.22)    503.64 (39.76)    430.23 (100.78)   -23.00 (6.69)     -1.45 (0.24)
ε2    615.38 (57.57)    318.85 (24.74)    481.64 (49.01)    508.26 (39.87)    453.16 (102.60)   -14.61 (6.95)     5.80 (0.87)
ε3    903.32 (59.69)    630.21 (30.51)    470.80 (47.55)    498.66 (39.14)    467.10 (102.37)   20.01 (7.23)      41.69 (1.78)
ε4    1400.90 (66.50)   1164.00 (43.24)   467.00 (46.26)    505.48 (39.79)    500.94 (102.72)   86.73 (9.19)      115.27 (4.02)

Three Jumps
ε1    765.72 (78.57)    340.76 (28.99)    720.49 (70.78)    754.51 (59.35)    676.34 (135.14)   -28.45 (6.80)     -1.85 (0.25)
ε2    873.97 (79.58)    452.01 (30.08)    731.76 (70.61)    765.04 (59.59)    682.13 (134.89)   -12.12 (6.85)     12.12 (0.88)
ε3    1164.00 (82.53)   767.45 (36.59)    718.24 (67.72)    752.01 (58.67)    704.64 (132.81)   38.63 (7.86)      63.43 (1.96)
ε4    1663.50 (91.73)   1299.90 (48.60)   731.80 (69.68)    758.67 (59.03)    756.10 (138.66)   141.26 (10.54)    161.96 (4.84)



Tables 2, 3 and 4 summarize the results for $H = \{0.5, 0.7, 0.9\}$, respectively. The
results confirm the same behavior for all the estimators as in the previous case without
long memory. Thus we can conclude that our JWTSRV estimator is robust to jumps and noise in
small samples even if we consider a volatility process with long memory, and it proved to be the
best estimator of $\langle p,p\rangle_t$ even in small samples. While we have so far studied only the in-sample
performance of the estimator, we present the out-of-sample, or forecasting, performance in the next section.

5.3. One-day-ahead forecasts of IV using JWTSRV

One of the many potentially useful applications of the proposed framework is volatility forecasting.
In particular, the one-day-ahead return variation forecast, $\mathrm{var}(p_{t+1}\mid\mathcal{F}_t)$, is of great interest to
practitioners. Thus we would like to study the forecasting ability of the proposed methodology as
well. Since we showed that the in-sample performance of the estimators is the same for different
models and that the JWTSRV estimator tends to consistently estimate $\langle p,p\rangle_t$ regardless of the
level of noise and number of jumps in the process, we reduce our simulation scheme to model
51 with a fixed level of noise and number of jumps. This setting allows us to study the impact
of noise and jumps on the forecasting performance of the estimators and to see if the JWTSRV
holds its power and is able to forecast $\mathrm{var}(p_{t+1}\mid\mathcal{F}_t)$.

Denoting the annualized one-day time interval $T_1 - T_0 = T_2 - T_1$,

$$E\left[\sigma_{T_1}^2\mid\mathcal{F}_{T_0}\right] = e^{-\kappa(T_1-T_0)}\sigma_{T_0}^2 + \alpha\left(1-e^{-\kappa(T_1-T_0)}\right), \qquad (53)$$

where $\sigma_t^2$ follows model 51 and $\mathcal{F}_T = \sigma\{\sigma_t^2;\ t\le T\}$ is the information set generated by the
instantaneous variance process up to time $T$. If we use integration operators, we have

$$E\left[\int_{T_0}^{T_1}\sigma_t^2\,dt\,\Big|\,\mathcal{F}_{T_0}\right] = \frac{1}{\kappa}\left(1-e^{-\kappa(T_1-T_0)}\right)\sigma_{T_0}^2 + \alpha(T_1-T_0) - \frac{\alpha}{\kappa}\left(1-e^{-\kappa(T_1-T_0)}\right). \qquad (54)$$

Table 4: Bias (variance in parentheses) $\times 10^4$ of all estimators from 10,000 simulations of the fractional stochastic volatility
model with Hurst parameter $H = 0.9$, with $\epsilon_1 = 0$, $\epsilon_2 = 0.0005$, $\epsilon_3 = 0.001$, $\epsilon_4 = 0.0015$. RV - 5 min. realized
variance estimator, BV - 5 min. bipower variation estimator, TSRV - 5 min. two-scale realized volatility, JWTSRV
- 5 min. jump wavelet two-scale realized variance. TSRV* and JWTSRV* are minimum variance estimators (see
A.48), and RK is the realized kernel.





      RV                BV                TSRV              TSRV*             RK                JWTSRV            JWTSRV*

No Jumps
ε1    5.90 (10.50)      -19.58 (13.51)    -26.67 (6.78)     -0.50 (0.25)      -61.72 (40.10)    -27.11 (6.78)     -0.92 (0.25)
ε2    110.47 (11.65)    84.15 (14.78)     -22.39 (7.05)     0.18 (0.83)       -47.77 (40.61)    -21.14 (7.07)     1.33 (0.84)
ε3    399.57 (15.33)    372.67 (19.88)    -29.78 (6.77)     -1.92 (1.45)      -34.79 (42.18)    -25.00 (6.83)     3.31 (1.47)
ε4    882.50 (22.98)    879.81 (30.30)    -28.32 (6.74)     -0.72 (2.18)      14.36 (44.32)     -17.21 (6.93)     10.63 (2.30)

One Jump
ε1    269.61 (35.26)    100.30 (17.80)    233.23 (29.92)    258.49 (21.56)    184.42 (73.31)    -25.58 (6.86)     -2.19 (0.25)
ε2    364.35 (34.40)    200.35 (19.23)    226.94 (28.28)    258.79 (21.54)    201.67 (71.72)    -21.96 (6.82)     4.05 (0.85)
ε3    648.93 (38.20)    498.06 (24.94)    218.29 (27.72)    249.82 (20.78)    214.87 (74.11)    -5.09 (7.19)      23.16 (1.66)
ε4    1143.10 (44.73)   1017.00 (35.55)   221.50 (27.14)    250.37 (21.52)    255.7:5 (71.84)   36.01 (8.13)      60.87 (3.15)

Two Jumps
ε1    507.64 (54.73)    217.73 (23.69)    468.53 (48.86)    499.98 (37.75)    422.53 (106.44)   -27.23 (7.05)     -1.50 (0.25)
ε2    618.08 (57.83)    323.80 (24.72)    475.53 (49.28)    505.72 (39.5:1)   446.02 (102.66)   -13.93 (6.99)     6.57 (0.88)
ε3    902.48 (63.40)    620.44 (30.54)    470.85 (50.44)    502.56 (40.29)    462.64 (106.41)   15.21 (7.49)      43.52 (1.81)
ε4    1399.10 (70.64)   1156.50 (43.00)   470.35 (49.76)    498.97 (40.92)    504.29 (109.07)   87.73 (9.16)      114.08 (3.94)

Three Jumps
ε1    767.20 (77.56)    337.59 (28.64)    721.80 (68.54)    755.62 (56.93)    674.80 (130.51)   -25.42 (6.82)     -2.66 (0.25)
ε2    866.12 (78.84)    443.31 (30.34)    720.71 (69.21)    754.90 (58.38)    689.69 (134.96)   -13.72 (6.83)     11.64 (0.91)
ε3    1164.80 (83.67)   759.78 (36.66)    730.27 (69.86)    758.64 (59.37)    713.23 (135.13)   41.37 (7.85)      60.48 (1.93)
ε4    1661.80 (93.00)   1303.50 (50.63)   724.24 (69.10)    752.55 (59.56)    762.91 (145.29)   142.24 (10.54)    163.84 (4.82)



If we want to express the one-day-ahead forecast, we simply use equations 53 and 54 and we get:

$$E\left[\int_{T_m}^{T_{m+1}}\sigma_t^2\,dt\,\Big|\,\mathcal{F}_{T_{m-1}}\right] = e^{-\kappa D}\,E\left[\int_{T_{m-1}}^{T_m}\sigma_t^2\,dt\,\Big|\,\mathcal{F}_{T_{m-1}}\right] + \alpha D\left(1-e^{-\kappa D}\right), \qquad (55)$$

where $D = T_{m+1} - T_m = T_m - T_{m-1}$. Equation 55 is the exact conditional forecast of $\int_{T_m}^{T_{m+1}}\sigma_t^2\,dt$,
but it is not feasible, as $E\left[\int_{T_{m-1}}^{T_m}\sigma_t^2\,dt\mid\mathcal{F}_{T_{m-1}}\right]$ is not observed in practice. But if we replace
this term by the estimate of the integrated variance on day $m$, we arrive at a simple method for
forecasting the integrated variance on day $m+1$. In empirical applications the true underlying
model parameters are unknown and the properties of the observed data differ from the simulated
ones, even though the simulations are based on parameters estimated on real-world data. Hence,
the estimation is required to be realistic, and the AR(1) process seems to serve well in this case.
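The conditional expectations in equations 53 and 54 are easy to evaluate numerically, and doing so gives a quick consistency check on the one-day-ahead relation. The sketch below assumes only the mean-reverting drift $\kappa(\alpha - \sigma_t^2)$ of model 51. The function names are hypothetical, and `one_day_ahead` encodes the relation obtained by integrating equation 53 over the next day, namely $e^{-\kappa D}$ times today's expected integrated variance plus $\alpha D(1-e^{-\kappa D})$.

```python
import numpy as np

def expected_variance(v0, kappa, alpha, dt):
    """E[sigma^2 at T0 + dt | F_T0] for a variance mean-reverting to alpha (Eq. 53)."""
    return np.exp(-kappa * dt) * v0 + alpha * (1.0 - np.exp(-kappa * dt))

def expected_integrated_variance(v0, kappa, alpha, dt):
    """E[integral of sigma^2 over [T0, T0 + dt] | F_T0] (Eq. 54)."""
    decay = 1.0 - np.exp(-kappa * dt)
    return decay / kappa * v0 + alpha * dt - alpha / kappa * decay

def one_day_ahead(expected_iv_today, kappa, alpha, D):
    """Expected integrated variance of the next day, given today's expected value."""
    return np.exp(-kappa * D) * expected_iv_today + alpha * D * (1.0 - np.exp(-kappa * D))
```

For example, with the parameters of model 51 ($\kappa = 5$, $\alpha = 0.04$, $D = 1/252$), the forecast from `one_day_ahead` coincides with the difference between the expected integrated variance over two days and over one day, both conditioned on the same starting variance. Because the forecast is linear in today's value, an AR(1) regression is a natural feasible counterpart.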

We use the simulation scheme for model 51 from the previous section. This time, we simulate
101 "continuous" sample paths over days $[T_0,T_1],\ldots,[T_{99},T_{100}],[T_{100},T_{101}]$, that is, $101 \times 23{,}400$
log returns. We split each simulated path into two parts. The first part, of $100 \times 23{,}400$ log returns,
is used to estimate the time series of 100 daily integrated variations using the tested estimators. Then,
the AR(1) model is used to estimate the coefficients of forecast equation 55, where the conditional
expectation on the right-hand side is replaced by the estimated integrated variation. The second
part, the last (101st) day, is saved for out-of-sample comparison purposes as the true integrated
variance of the day, which is compared with the AR(1) forecast of the integrated variance for
day $m+1$. This procedure is repeated for each simulated sample path of $101 \times 23{,}400$ log returns
and all the estimators tested in the previous exercise.
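The forecasting step can be sketched as an ordinary least squares fit of an AR(1) on the series of daily estimates. This is an illustration only: the text does not say how the AR(1) is estimated, so OLS is an assumption here, and the function name is hypothetical.

```python
import numpy as np

def ar1_forecast(daily_iv):
    """Fit IV_{m+1} = a + b * IV_m by OLS and forecast the day after the sample.

    daily_iv : sequence of daily integrated variance estimates (e.g. 100 days)
    Returns (a, b, forecast).
    """
    iv = np.asarray(daily_iv, dtype=float)
    # Regress each day's estimate on a constant and the previous day's estimate.
    X = np.column_stack([np.ones(len(iv) - 1), iv[:-1]])
    (a, b), *_ = np.linalg.lstsq(X, iv[1:], rcond=None)
    return a, b, a + b * iv[-1]
```

In the simulation design above, `daily_iv` would hold the 100 daily estimates from one estimator, and the returned forecast would be compared with the true integrated variance of day 101.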

We employ the traditional Mincer and Zarnowitz (1969) approach to assess the forecasting
performance of the individual estimators. We compare alternative variance forecasts by projecting
the true realized integrated variance on day $m+1$, $\int_{T_m}^{T_{m+1}}\sigma_t^2\,dt$, on a constant and various estimator
forecasts. For example, we evaluate the JWTSRV forecasting performance by running the following
regression:

$$\langle p,p\rangle_{T_{m+1}} = \alpha + \beta\,\widehat{RV}^{(JWTSRV)}_{T_{m+1}\mid T_m} + \epsilon, \qquad (56)$$

where $\widehat{RV}^{(JWTSRV)}_{T_{m+1}\mid T_m}$ is the one-day-ahead forecast of integrated variance from day $m$ to day $m+1$
using the AR(1) prediction. Thus, equation 56 regresses the true realized variance $\langle p,p\rangle_{T_{m+1}}$ from
day $m+1$ on a constant and the variance forecast using the JWTSRV estimator. If the JWTSRV
estimator performs well, the forecast should be unbiased and the forecast error small. In other
words, $\alpha = 0$ and $\beta = 1$, and the $R^2$ of the regression is close to 1. Thus we test the null
hypotheses $H_0: \alpha = 0$ and $H_0: \beta = 1$ against the alternatives $H_A: \alpha \ne 0$ and $H_A: \beta \ne 1$.

Table 5: Out-of-sample Mincer-Zarnowitz regressions (Eq. 57) on model with no jumps. Results significant at 95%
are in bold; OLS standard errors in parentheses.

Joint Mincer-Zarnowitz regression
const.            RV                BV                TSRV              RK                JWTSRV            R^2
-0.055 (0.003)    1.568 (0.099)     -0.440 (0.103)                                                          0.895
0.009 (0.003)     0.072 (0.093)     -0.172 (0.078)    1.070 (0.039)                                         0.941
0.009 (0.003)     0.082 (0.090)     -0.108 (0.077)    1.184 (0.0407)    -0.199 (0.027)                      0.944
0.009 (0.003)     0.082 (0.090)     -0.109 (0.077)    1.054 (0.293)     -0.199 (0.027)    0.131 (0.294)     0.944

Individual Mincer-Zarnowitz regression
          const.            coefficient       R^2
RV        -0.059 (0.003)    1.145 (0.013)     0.893
BV        -0.063 (0.003)    1.167 (0.014)     0.869
TSRV      -0.002 (0.002)    0.995 (0.008)     0.940
RK        -0.002 (0.003)    1.017 (0.016)     0.807
JWTSRV    0.001 (0.002)     0.997 (0.008)     0.939

Mincer-Zarnowitz regression for minimum variance TSRV estimators
          const.            coefficient       R^2
TSRV*     -0.002 (0.001)    0.993 (0.006)     0.959
JWTSRV*   0.001 (0.001)     0.996 (0.006)     0.959

In our simulations, we study a Mincer-Zarnowitz style regression combining several estimators:

$$\langle p,p\rangle_{T_{m+1}}^{(j)} = \alpha + \sum_{\mathcal{M}}\beta_{\mathcal{M}}\,\widehat{RV}^{(\mathcal{M})}_{T_{m+1}\mid T_m,\,j} + \epsilon_j, \qquad (57)$$

for $j = 1,\ldots,10{,}000$ simulated sample paths. $\widehat{RV}^{(\mathcal{M})}_{T_{m+1}\mid T_m}$ is the one-day-ahead forecast of integrated
variance from day $m$ to day $m+1$ given by the AR(1) model for the time series of daily variance
estimated by the $\mathcal{M}$ estimator of realized variance. Regression 57 can be naturally interpreted as a
variance forecast encompassing regression, as a coefficient significantly different from zero implies
that the information in that particular forecast is not included in the forecasts of other models.
To test the robustness of the results, we also run individual regressions where we consider only a
constant and a single forecasting model. Thus we run separate regressions, one for each estimator,
to supplement the joint regression from 57.
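A Mincer-Zarnowitz regression of this kind, individual or joint, is a plain OLS fit of the realized values on a constant and one or more forecasts. The sketch below is illustrative and uses hypothetical names; it returns the coefficients, their classical OLS standard errors and the $R^2$, which are the quantities reported in the forecast evaluation tables.

```python
import numpy as np

def mincer_zarnowitz(realized, forecasts):
    """OLS of realized variance on a constant and one or more forecast series.

    realized  : array of shape (N,), true integrated variance per simulated path
    forecasts : array of shape (N,) or (N, K), one column per estimator's forecast
    Returns (coef, se, r2); coef[0] is the constant, coef[1:] the forecast loadings.
    """
    y = np.asarray(realized, dtype=float)
    F = np.asarray(forecasts, dtype=float)
    if F.ndim == 1:
        F = F[:, None]                       # single forecast: individual regression
    X = np.column_stack([np.ones(len(y)), F])

    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof             # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, se, r2
```

An unbiased forecast corresponds to a constant near 0 and a loading near 1. A simple way to test the two nulls separately is to compare `coef[0] / se[0]` and `(coef[1] - 1) / se[1]` with standard normal critical values.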

5.3.1. Forecasting without jumps 

We run the simulations for two model settings using model 51 with one jump and with no 
jumps. Let us start with the model without jumps first. The OLS estimates of all the forecast 
evaluation regressions for the model without jumps are reported in Table 5. The results suggests 
that the TSRV performs as the best forecasting vehicle. Comparing the individual regressions, 
the TSRV has the highest B? and the coefficient closest to 1 with an insignificant coefficient, 
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Table 6: Out-of-sample Mincer-Zarnowitz regressions (Eq. 57) on model with 1 jump. Results significant at 95% are 
in bold; OLS standard errors in parenthesis. 



Joint Mincer-Zarnowitz regression 


const. 


RV 


BV 


TSRV 


RK 


JWTSRV 




-0.032 (0.006) 
0.032 (0.010) 
0.032 (0.010) 
0.000 (0.006) 


-0.500 (0.045) 
-1.857 (0.185) 
-1.873 (0.186) 
0.129 (0.122) 


1.538 (0.037) 
1.512 (0.036) 
1.514 (0.036) 
-0.045 (0.043) 


1.251 (0.166) 
1.182 (0.183) 
-0.042 (0.113) 


0.088 (0.098) 
-0.204 (0.059) 


1.078 (0.026) 


0.811 
0.822 
0.822 
0.936 


Individual Mincer-Zarnowitz regression 


const. 


RV 


BV 


TSRV 


RK 


JWTSRV 




RV -0.100 (0.009) 
BV -0.079 (O.OO.S) 
TSRV -0.050 (0.007) 
RK -0.051 (0.008) 
JWTSRV -0.003 (0.002) 


1.123 (0.037) 


1.181 (0.019) 


1.028 (0.032) 


1.041 (0.035) 


1.024 (0.009) 


0.480 
0.788 
0.600 
0.476 
0.935 


Mincer-Zarnowitz regression for 


minimum variance 


< TSRV estimat 


ors 








const. 






TSRV* 




JWTSRV* 


h2 


TSRV* -0.049 (0.007) 
JWTSRV* -0.003 (0.002) 






1.021 (0.032) 




1.014 (0.007) 


0.506 
0.957 



which suggests that the forecasts of the TSRV are biased only very slightly (as the coefficient is
significantly different from 1). When looking at the joint regressions, we can see that the addition
of all the other estimators does not improve this result. Moreover, when the TSRV is included
in the regression, it is the only significant estimator, meaning that none of the other estimators
has additional information not included in the TSRV forecast. In other words, adding the other
estimators' forecasts to the TSRV brings no additional explanatory power to the regression. The
JWTSRV forecast has the same performance as the simple TSRV, as there are no jumps in the
simulated process, and thus the asymptotic behavior of these two estimators should be the same. The
JWTSRV is expected to have much better performance in the simulations where we include jumps.
All the estimators are computed with a 5-minute sampling frequency.

In addition, the last part of the table provides results for the optimal sampling that minimizes
the variance of the estimator. The TSRV* with optimally chosen sampling outperforms the 5-minute
TSRV. The JWTSRV* again has the same performance, as expected.

5.3.2. Forecasting with jumps

Let us see how the results change when we add a single jump to the simulated model. The
OLS estimates of all the forecast evaluation regressions for the model with jumps are reported
in Table 6. Looking at the results of the individual regressions, one can see that the JWTSRV
largely outperforms all the other estimators, with results close to those from the model without
jumps in the previous section. This suggests that the JWTSRV is robust to jumps even when
we consider forecasting. The joint regression confirms this result. The regression including all the
forecasts using the four considered estimators has the largest explanatory power. Moreover, the
coefficient of the JWTSRV is significant, while the other coefficients are not significant, suggesting
that the other estimators carry no additional information. Taking the JWTSRV forecasts away
from the regression results in a much lower R^2. It is interesting to note that in this case all the
other coefficients are significant, suggesting multicollinearity caused by jumps in the process. The
reader can also note how the addition of the BV improves the result. In fact, the BV dominates the
TSRV, with a much higher R^2. As the BV is used for jump detection, this finding confirms
the results from the literature.

In addition, we again include results for optimal sampling, which minimizes the variance of the
TSRV-based estimators. In this case again, we can see that the result improves and the JWTSRV*
yields the best result.
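
The forecast evaluation used in this section can be illustrated with a minimal sketch. It assumes that Eq. 57 has the usual Mincer-Zarnowitz form, i.e. the realized measure is regressed on a constant and a single forecast; the function name `mincer_zarnowitz` and the toy numbers are hypothetical and are not taken from the simulations.

```python
def mincer_zarnowitz(realized, forecast):
    """OLS fit of realized_t = alpha + beta * forecast_t + error_t.

    Returns (alpha, beta, r_squared). An unbiased forecast has
    alpha close to 0 and beta close to 1.
    """
    n = len(realized)
    if n != len(forecast) or n < 3:
        raise ValueError("need two series of equal length with at least 3 points")
    mean_x = sum(forecast) / n
    mean_y = sum(realized) / n
    # Centered sums of squares and cross products
    sxx = sum((x - mean_x) ** 2 for x in forecast)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(forecast, realized))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((y - alpha - beta * x) ** 2 for x, y in zip(forecast, realized))
    ss_tot = sum((y - mean_y) ** 2 for y in realized)
    return alpha, beta, 1.0 - ss_res / ss_tot


# Hypothetical toy data: realized values are an exact linear function of the forecast.
toy_forecast = [0.8, 1.0, 1.2, 1.5, 0.9, 1.1]
toy_realized = [0.05 + 1.1 * f for f in toy_forecast]
alpha, beta, r2 = mincer_zarnowitz(toy_realized, toy_forecast)
```

In this toy case the fit recovers the constant 0.05 and the slope 1.1; a forecast would be judged unbiased when the constant is statistically indistinguishable from 0 and the slope from 1.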

To conclude this section, the results suggest that when the JWTSRV estimator is used for 
variance forecasting in the presence of jumps and noise, the forecasts will be unbiased even on small 
samples. This makes the JWTSRV estimator a very powerful tool for forecasting the variance of 
stock market returns. With the theoretical results in hand, we can move to empirical examples 
and use the JWTSRV to forecast the volatility of real-world data. 

6. Decomposition of empirical volatility 

In this section, we apply the proposed theory to real-world data. We
test several integrated volatility estimators in comparison to our JWTSRV estimator and
study their distributional properties. The JWTSRV proved to have the lowest bias in the Monte Carlo
simulations, thus we also expect it to have the best performance on the real data set.

6.1. Data description 

Foreign exchange futures contracts are traded on the Chicago Mercantile Exchange (CME) on a
24-hour basis. As these markets are among the most liquid, they are suitable for analysis of
high-frequency data. We will estimate the realized volatility of British pound (GBP), Swiss franc (CHF)
and euro (EUR) futures. All contracts are quoted in the unit value of the foreign currency in US
dollars. It is advantageous to use currency futures data for the analysis instead of spot currency
prices, as they embed interest rate differentials and do not suffer from additional microstructure
noise coming from over-the-counter trading. The cleaned data are available from Tick Data, Inc.

It is very important to look first at the changes in the trading system before we proceed with
the estimation on the data. In August 2003, for example, the CME launched the Globex trading
platform, and for the first time ever in a single month, the trading volume on the electronic
trading platform exceeded 1 million contracts every day. On Monday, December 18, 2006, the
CME Globex electronic trading platform started offering nearly continuous trading. More
precisely, the trading cycle became 23 hours a day (from 5:00 pm on the previous day until 4:00
pm on the current day, with a one-hour break in continuous trading), from 5:00 pm on Sunday until
4:00 pm on Friday. These changes certainly had a dramatic impact on trading activity and the
amount of information available, resulting in difficulties in comparing the estimators on the
pre-2003 data, the 2003-2006 data and the post-2006 data. For this reason, we restrict our analysis to
a sample period extending from January 5, 2007 through November 17, 2010, which contains the
most recent financial crisis. The futures contracts we use are automatically rolled over to provide
continuous price records, so we do not have to deal with different maturities.

The tick-by-tick transactions are recorded in Chicago Time, referred to as Central Standard
Time (CST). Therefore, in a given day, trading activity starts at 5:00 pm CST in Asia, continues in
Europe followed by North America, and finally closes at 4:00 pm in Australia. To exclude potential
jumps due to the one-hour gap in trading, we redefine the day in accordance with the electronic
trading system. Moreover, we eliminate transactions executed on Saturdays and Sundays, US
federal holidays, December 24 to 26, and December 31 to January 2, because of the low activity
on these days, which could lead to estimation bias. Finally, we are left with 944 days in the
sample. Looking more deeply at higher frequencies, we find a large number of multiple transactions
happening at exactly the same time stamp. We use the arithmetic average for all observations with
the same time stamp.

Footnote: http://www.tickdata.com/

Table 7: The table summarizes the daily log-return distributions of GBP, CHF and EUR futures. The sample period
extends from January 5, 2007 through November 17, 2010, accounting for a total of 944 observations.

        Mean      St.dev.    Skew.         Kurt.
GBP     0.0001    0.0119     -0.3852       4.4356
CHF     0.0002    0.0068      0.2440       5.4662
EUR     0.0002    0.009[?]   [illegible]   [illegible]
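
The treatment of multiple transactions with an identical time stamp, described in the data-cleaning paragraph of Section 6.1, can be sketched as follows. This is only an illustration under stated assumptions: the function name `average_same_timestamp` and the toy ticks are hypothetical, and time stamps are represented as plain integers.

```python
def average_same_timestamp(ticks):
    """Collapse ticks sharing a time stamp into one observation (arithmetic mean).

    `ticks` is an iterable of (timestamp, price) pairs; the result is a list of
    (timestamp, mean_price) pairs sorted by timestamp.
    """
    totals = {}
    for timestamp, price in ticks:
        total, count = totals.get(timestamp, (0.0, 0))
        totals[timestamp] = (total + price, count + 1)
    return [(ts, total / count) for ts, (total, count) in sorted(totals.items())]


# Hypothetical toy ticks: two trades at time 1, one at time 2, three at time 3.
toy_ticks = [(1, 1.50), (1, 1.52), (2, 1.51), (3, 1.49), (3, 1.50), (3, 1.51)]
cleaned = average_same_timestamp(toy_ticks)
```

Each time stamp then carries exactly one price, so the high-frequency returns are well defined.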

6.2. Statistical properties of unconditional return and integrated volatility 

Having prepared the data, we can estimate the integrated volatilities and study their statistical
properties as well as the properties of the daily unconditional returns. For each futures contract,
the daily integrated volatility is estimated using the square root of the realized variance estimator,
the bipower variation estimator, the two-scale realized volatility, the realized kernel and the jump
wavelet two-scale realized variance defined by Eq. 47. All the estimators are adjusted for small sample
bias. For convenience, we refer to the estimators in the description of the results as RV, BV, TSRV,
RK and JWTSRV, respectively. The RV and BV estimates are computed on 5-min log-returns.
The TSRV and the JWTSRV are estimated using a slow time scale of 5 minutes.
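
For orientation, the two simplest estimators in this list can be sketched in a few lines. The sketch uses the textbook definitions (realized variance as the sum of squared intraday returns, bipower variation as the scaled sum of products of adjacent absolute returns) and omits the small sample bias adjustments mentioned above; function names and the toy returns are hypothetical.

```python
import math


def log_returns(prices):
    """Log-returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def realized_variance(returns):
    """RV: sum of squared intraday returns."""
    return sum(r * r for r in returns)


def bipower_variation(returns):
    """BV: (pi / 2) times the sum of products of adjacent absolute returns."""
    return (math.pi / 2.0) * sum(abs(a) * abs(b) for a, b in zip(returns, returns[1:]))


# Hypothetical 5-minute log-returns for one day (far fewer than a real trading day).
toy_returns = [0.001, -0.002, 0.0015, -0.0005]
rv = realized_variance(toy_returns)
bv = bipower_variation(toy_returns)
rv_volatility = math.sqrt(rv)  # square root, as used for the integrated volatility
```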

Table 7 presents the summary statistics for the daily log-returns of GBP, CHF and EUR futures
over the sample period, $t = 1, \ldots, 944$, i.e., January 5, 2007 to November 17, 2010. The summary
statistics display an average return very close to zero, skewness, and excess kurtosis, which is
consistent with the large empirical literature started probably by Fama (1965) and Mandelbrot
(1963). As observed by Andersen et al. (2001), when the log-returns are standardized by the
integrated volatility, $r_t / IV_t^{1/2}$, the unconditional returns are very close to a Gaussian distribution.

Table 8 summarizes the unconditional distribution of the daily log-returns standardized by the
integrated volatility, $r_t / IV_t^{1/2}$, and confirms this result. However, quite significant differences can
be found among the estimators. While the high kurtosis (above 4) for the raw returns is reduced
to the range of 2.51-2.81 for the log-returns standardized using the integrated volatility estimators,
there is a notable difference between the estimators. The RV is expected to perform the worst, as
it should be biased by microstructure noise and jumps, which is confirmed. The TSRV as well as
the RK are not biased by noise, but they still contain the jump component of integrated variance. The
BV should be robust to the jump component; the statistical distribution of $r_t / IV_t^{1/2}$,
where $IV_t$ is estimated by the BV, should therefore be closer to Gaussian. Finally, we expect the JWTSRV
estimator to perform the best, as it proved to be robust to noise and jumps in the Monte Carlo
simulations. We also use the QQ plots in Figure 1 for help. Similarly to Fleming and
Paye (2011) and Andersen et al. (2011), we ask whether the jumps account for the non-normality
of the unconditional log-returns standardized by the integrated volatility estimators found in the
literature. We add the TSRV, RK and JWTSRV estimators for comparison. Figure 1 shows that
returns standardized by integrated volatility using the JWTSRV provide the best approximation
of the standard normal distribution. This result is in line with what we expected, as the JWTSRV
proved to be robust to noise and jumps in our large Monte Carlo study. The result from the BV
Table 8: The table summarizes the standardized daily log-return distributions for GBP, CHF and EUR futures
using $r_t / IV_t^{1/2}$ and the daily distributions of integrated volatility $IV_t^{1/2}$. Integrated volatility $IV_t^{1/2}$ is estimated using
the RV, the BV on 5-min. log-returns, and the TSRV and JWTSRV on 5 minutes for a slow time scale and the
RK. The sample period extends from January 5, 2007 through November 17, 2010, accounting for a total of 944
observations.

             Distributions of r_t / IV_t^{1/2}          Distributions of IV_t^{1/2}
             Mean     St. dev.  Skew.     Kurt.         Mean     St. dev.  Skew.    Kurt.

GBP futures
RV           0.0419   0.8834    -0.0880   2.6029        0.0075   0.0038    1.8394   7.5736
BV           0.0448   0.9266    -0.0669   2.6941        0.0073   0.0037    1.7336   6.7996
TSRV         0.0451   0.9026    -0.0710   2.5744        0.0073   0.0037    1.7611   7.0767
RK           0.0458   0.9406    -0.0757   2.5162        0.0070   0.0037    1.8201   7.6473
JWTSRV       0.0489   0.9035    -0.0710   2.7512        0.0071   0.0037    1.7629   7.0112

CHF futures
RV           0.0238   0.8959     0.0380   2.6272        0.0076   0.0029    1.6875   8.2794
BV           0.0272   0.9424     0.0727   2.7020        0.0073   0.0028    1.5696   7.5983
TSRV         0.0278   0.9180     0.0568   2.6161        0.0073   0.0028    1.5572   7.3379
RK           0.0281   0.9530     0.0425   2.5371        0.0070   0.0028    1.8179   9.9149
JWTSRV       0.0389   0.9253     0.0611   2.7170        0.0070   0.0026    1.4359   6.5452

EUR futures
RV           0.0379   0.9550    -0.0215   2.5728        0.0068   0.0031    1.4785   5.8493
BV           0.0410   0.9970    -0.0271   2.6219        0.0066   0.0031    1.5001   5.9803
TSRV         0.0397   0.9638    -0.0133   2.5502        0.0068   0.0031    1.4263   5.4871
RK           0.0415   0.9898    -0.0069   2.4497        0.0065   0.0031    1.5351   6.2713
JWTSRV       0.0452   0.9587     0.0014   2.8144        0.0064   0.0030    1.4345   5.4716
Figure 1: QQ plots of normalized daily log-returns $r_t$ by RV, BV, TSRV, RK and JWTSRV estimators, (a) GBP
futures, (b) CHF futures and (c) EUR futures



leaves us puzzled. As it is expected to be robust to jumps, it should be able to perform better.
The returns standardized by the BV have higher kurtosis (closer to the Gaussian value of 3) than
those standardized by the RV, TSRV or RK, thus the BV outperforms these estimators to some extent.
However, the JWTSRV confirms the theory presented in the previous sections.
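
The statistics reported in Table 8 for the standardized returns can be illustrated with a short sketch. It assumes population (divide-by-n) moments and non-excess kurtosis, so that a Gaussian sample would give a kurtosis near 3; the exact conventions used for the table are not stated in the text, and the function names and toy inputs are hypothetical.

```python
import math


def standardize(returns, volatilities):
    """Daily returns divided by the matching daily volatility estimate IV_t^{1/2}."""
    return [r / v for r, v in zip(returns, volatilities)]


def sample_moments(x):
    """Mean, standard deviation, skewness and (non-excess) kurtosis of x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(var)
    skew = sum((v - mean) ** 3 for v in x) / (n * std ** 3)
    kurt = sum((v - mean) ** 4 for v in x) / (n * var ** 2)
    return mean, std, skew, kurt


# Hypothetical daily returns and volatility estimates (not the futures data).
toy_returns = [0.01, -0.02, 0.015, -0.03]
toy_vols = [0.01, 0.02, 0.015, 0.03]
z = standardize(toy_returns, toy_vols)
mean, std, skew, kurt = sample_moments(z)
```

Applied to each estimator in turn, the same two functions would give one row of the left panel of Table 8 per estimator.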

Moving from the distributional properties of the standardized daily log-returns, Table 8 also
shows the distributional properties of the $IV_t^{1/2}$ estimators. Again, the JWTSRV provides lower
estimates of $IV_t^{1/2}$ and is also less volatile than the RV. This finding is consistent with the fact
that the RV can be affected by microstructure noise, and, as demonstrated in the Monte Carlo
simulations, the JWTSRV is able to estimate the true integrated variance with the lowest bias in
the presence of noise and jumps in the data. It is surprising, though, that the average estimate
of $IV_t^{1/2}$ using the JWTSRV is 6.34% lower than the average estimate from the RV (computed as
arithmetic averages of the estimators on GBP futures, CHF futures and EUR futures), with kurtosis
12.32% lower than the RV. The average estimate of $IV_t^{1/2}$ using the JWTSRV is 3.76% lower than
the average estimate using the BV, with kurtosis 6.34% lower. Similarly, the average estimate of $IV_t^{1/2}$
using the JWTSRV is 4.52% lower than the average estimate using the TSRV, with kurtosis 4.39%
lower. Finally, the average estimate of $IV_t^{1/2}$ using the JWTSRV is the same as the average estimate
using the RK, with kurtosis 25.39% lower. It is thus interesting that while the TSRV accounts for
noise but not jumps and the BV accounts for jumps but is not able to deal with noise, they have
comparable deviations from the JWTSRV, which seems to estimate the integrated volatility without
jumps and noise. Most interesting is that the average estimate of the RK is exactly the same as
the average estimate of the JWTSRV. However, the RK estimates have much higher kurtosis. This
result shows that the RK is a powerful estimator of the realized variance. Finally, let us note that
these differences are economically significant, as they result in different asset pricing.
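
The percentage comparisons in this paragraph are simple arithmetic on the right panel of Table 8. The following sketch recomputes the RV versus JWTSRV comparison from the rounded table entries (GBP, CHF, EUR order); the helper name `percent_lower` is hypothetical, and because the table is rounded to four decimals the recomputed mean gap need not match the quoted figure exactly.

```python
# Means and kurtosis of IV_t^{1/2} from Table 8, in the order GBP, CHF, EUR.
iv_mean = {"RV": [0.0075, 0.0076, 0.0068], "JWTSRV": [0.0071, 0.0070, 0.0064]}
iv_kurt = {"RV": [7.5736, 8.2794, 5.8493], "JWTSRV": [7.0112, 6.5452, 5.4716]}


def percent_lower(candidate, benchmark):
    """How many percent the average of `candidate` lies below the average of `benchmark`."""
    avg_candidate = sum(candidate) / len(candidate)
    avg_benchmark = sum(benchmark) / len(benchmark)
    return 100.0 * (1.0 - avg_candidate / avg_benchmark)


mean_gap = percent_lower(iv_mean["JWTSRV"], iv_mean["RV"])
kurt_gap = percent_lower(iv_kurt["JWTSRV"], iv_kurt["RV"])
```

With these rounded inputs the kurtosis gap comes out at about 12.32%, in line with the text, while the mean gap comes out at roughly 6.4%; the small difference from the quoted 6.34% is presumably due to the rounding of the tabulated means.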

6.3. IVt decomposition using wavelets 

From the previous analysis, we could see that the JWTSRV provides the best estimator not
only theoretically, but also on empirical data sets. Although this is the most important property
of the JWTSRV, it is not the only one we can take advantage of. Another advantage is that
by using our estimator, we are able to decompose the integrated variance into several investment
horizons, or components. In our analysis, we limit ourselves to decomposition into four scales
corresponding to investment horizons of 5-10 minutes, 10-20 minutes, 20-40 minutes and 40-80
minutes, and the rest (80 minutes up to 1 day). As shown in the theoretical part of this work,
we can comfortably decompose the integrated variance into these components, as their sum will
always give the integrated variance estimator.

More precisely, the components of the JWTSRV from Eq. 47 correspond to various investment
horizons. Thus, we will refer to these as $JWTSRV_j$,



where j = 1, ... ,4 are scales corresponding to 5-10 minutes, 10-20 minutes, 20-40 minutes and 
40-80 minutes, and j = 5 will contain the 80 minutes up to 1 day investment horizon. 



Footnote: It should be noted that any investment horizons of interest may be chosen arbitrarily.
Figure 2: GBP futures: (a) daily returns, (b) JWTSRV estimated jump variation, (c) $IV_t$ estimated by JWTSRV,
(d) decomposition of $IV_t$ using $JWTSRV_j$ for $j = 1, \ldots, 5$ corresponding to investment horizons of 5-10 minutes,
10-20 minutes, 20-40 minutes, 40-80 minutes and 80 minutes up to 1 day.

Figure 3: CHF futures: (a) daily returns, (b) JWTSRV estimated jump variation, (c) $IV_t$ estimated by JWTSRV,
(d) decomposition of $IV_t$ using $JWTSRV_j$ for $j = 1, \ldots, 5$ corresponding to investment horizons of 5-10 minutes,
10-20 minutes, 20-40 minutes, 40-80 minutes and 80 minutes up to 1 day.

Figure 4: EUR futures: (a) daily returns, (b) JWTSRV estimated jump variation, (c) $IV_t$ estimated by JWTSRV,
(d) decomposition of $IV_t$ using $JWTSRV_j$ for $j = 1, \ldots, 5$ corresponding to investment horizons of 5-10 minutes,
10-20 minutes, 20-40 minutes, 40-80 minutes and 80 minutes up to 1 day.

The final decomposition of the integrated variance into jumps and several investment horizons
can be best seen from Figures 2, 3 and 4, which provide the returns, estimated jumps and finally
integrated volatilities on investment horizons of 5-10 minutes, 10-20 minutes, 20-40 minutes,
40-80 minutes and 80 minutes up to 1 day. We provide $JWTSRV_j$, which is the decomposed $IV_t$. It
is interesting to observe that most of the information of the integrated variance is carried by the
fastest scale, the 5-10 minute investment horizon. This is about 50% of the total variation. The
longer the horizon, the lower its contribution to the total variance.

For better illustration, we annualize the square root of the integrated variance in order to
get the annualized volatility and we compute the components of the volatility on our investment
horizons. Figure 5 shows this decomposition. The first plot, 5 (a), shows the total volatility estimate,
while 5 (b) to 5 (f) show the investment horizons of 5-10 minutes, 10-20 minutes, 20-40 minutes,
40-80 minutes and 80 minutes up to 1 day, respectively. Most of the volatility (around 50%)
comes from the fastest, 5-10 minute investment horizon. This is a new insight we bring to volatility
modeling. In fact, it is a logical finding, as it shows that volatility is created on fast scales of up
to 10 minutes rather than on slower scales.

7. Conclusion 

In this paper, we present the complete theoretical framework of wavelet-based realized variation 
generalizing the current realized variation theory to the time-frequency domain. 

Building on our theoretical results, in particular the wavelet representation theorem, which
extends the well-known martingale representation theorem, the estimator of wavelet-based realized
variation is defined together with its theoretical properties. Using wavelets, the estimator is able to
consistently estimate jumps from the price process. It is robust to noise and it generates an
unbiased consistent estimator of the variation of the latent process. The theoretical part also contains
an important discussion of the similarities between wavelet theory and stochastic processes.

Figure 5: Decomposed annualized volatility (by 252 days), (a) total $IV_t^{1/2}$ estimate on GBP, CHF and EUR futures
using JWTSRV, (b) volatility on investment horizon of 5-10 minutes, (c) volatility on investment horizon of 10-20
minutes, (d) volatility on investment horizon of 20-40 minutes, (e) volatility on investment horizon of 40-80 minutes,
(f) volatility on investment horizon of 80 minutes up to 1 day. Note that the sum of components (b), (c), (d), (e) and
(f) gives the total volatility plotted in (a).

To support the theory, a numerical study of the small sample performance of the estimators is 
carried out. In this study, we compare our estimators to several of the most popular estimators, 
namely, realized variance, bipower variation, two-scale realized volatility and realized kernels. The 
wavelet-based estimator proves to have the lowest bias of all the estimators in the jump-diffusion model
with stochastic volatility as well as the fractional stochastic volatility model simulated with different 
levels of noise and numbers of jumps. While all the other estimators suffer from substantial bias 
caused either by jumps or by noise, our theory proves to hold its properties under both noise 
and jumps. As predictability of volatility is of interest to researchers as well as practitioners, a 
numerical study of the behavior of the forecasts is also carried out. Again, our theory proves to be 
the most powerful in forecasting volatility under the different simulation settings. 

The last section uses the theory to decompose the empirical volatility. By studying the
statistical properties of unconditional daily log-return distributions standardized by volatility estimated
using the different estimators, we find that standardization by our wavelet-based estimator brings
the returns close to the Gaussian distribution. All the other estimators are affected by
the presence of jumps in the data. The differences are economically significant, as we find that
the average volatility estimated using our wavelet-based theory is 6.34% lower than the volatility
estimated with the standard realized variance estimator.

Concluding the empirical findings, we show that our wavelet-based theory brings a significant
improvement to volatility estimation. It also offers a time-frequency way of realized volatility
measurement which helps us to better understand the dynamics of stock market behavior. Specifically,
our theory uncovers that most of the volatility is created on higher frequencies.
Appendix A. Technical proofs

Appendix A.1. Wavelets introduction

For a family of Daubechies wavelets $D(L)$, where $L$ is assumed to be even, the scaling filter
$m_0(\xi)$ associated with the scaling function $\phi$ has the following form:

$$ m_0(\xi) = \left( \frac{1 + e^{i\xi}}{2} \right)^{L/2} \mathcal{L}(\xi), \tag{A.1} $$

where $\mathcal{L}(\xi)$ denotes a trigonometric polynomial of order $(L/2) - 1$ such that the following holds:

$$ |\mathcal{L}(\xi)|^2 = P_{L/2}\left( \sin^2(\xi/2) \right) \quad \text{and} \quad \mathcal{L}(0) = 1, \tag{A.2} $$

where $P_{L/2}$ is a polynomial of order $(L/2) - 1$,

$$ P_{L/2}(x) = \sum_{n=0}^{(L/2)-1} \binom{(L/2) - 1 + n}{n} x^n. \tag{A.3} $$

The Daubechies wavelet $D(L)$ has $L/2$ vanishing moments, i.e.,

$$ \int_{-\infty}^{\infty} x^k \psi(x) \, dx = 0, \quad k = 0, \ldots, (L/2) - 1. \tag{A.4} $$

Furthermore, for $(L/2) \geq 2$ the Daubechies wavelets are continuous with a compact support of
length $L$, hence for a wavelet and scaling function we can write:

$$ \mathrm{supp}\, \psi = \left( -(L/2) + 1, \, (L/2) \right), \quad \mathrm{supp}\, \phi = (0, L - 1). \tag{A.5} $$

For $L = 2$, the length of a Daubechies wavelet is 2, and it corresponds to the Haar wavelet.

Appendix A.2. Daubechies wavelets

The squared gain function for the associated Daubechies family wavelet filter is defined as:

$$ \mathcal{H}_{1,L}(f) \equiv 2 \sin^{L}(\pi f) \sum_{l=0}^{L/2-1} \binom{L/2 - 1 + l}{l} \cos^{2l}(\pi f). \tag{A.6} $$

The squared gain function for the corresponding Daubechies family scaling filter can be obtained
as

$$ \mathcal{G}_{1,L}(f) \equiv \mathcal{H}_{1,L}(1/2 - f), \tag{A.7} $$

hence we get

$$ \mathcal{G}_{1,L}(f) \equiv 2 \cos^{L}(\pi f) \sum_{l=0}^{L/2-1} \binom{L/2 - 1 + l}{l} \sin^{2l}(\pi f). \tag{A.8} $$

Note that at the first level of the wavelet decomposition, $j = 1$, the wavelet filter is an approximation of
an ideal high-pass filter $|f| \in [1/4, 1/2]$, i.e., only the high-frequency part of the signal (or process)
passes through the filter. Conversely, the scaling filter is an approximation of an ideal low-pass
filter $|f| \in [0, 1/4]$, so only the low-frequency part of the signal passes through.

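The Daubechies D(4) DWT coefficients for the scaling filter $g$ and for the wavelet filter $h$ have
the following values:

$$ g_0 = \frac{1 + \sqrt{3}}{4\sqrt{2}}, \quad g_1 = \frac{3 + \sqrt{3}}{4\sqrt{2}}, \quad g_2 = \frac{3 - \sqrt{3}}{4\sqrt{2}}, \quad g_3 = \frac{1 - \sqrt{3}}{4\sqrt{2}}, \tag{A.9} $$

$$ h_0 = \frac{1 - \sqrt{3}}{4\sqrt{2}}, \quad h_1 = \frac{-3 + \sqrt{3}}{4\sqrt{2}}, \quad h_2 = \frac{3 + \sqrt{3}}{4\sqrt{2}}, \quad h_3 = \frac{-1 - \sqrt{3}}{4\sqrt{2}}. \tag{A.10} $$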
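
The D(4) coefficients in A.9 and A.10 can be checked numerically. The sketch below builds the scaling filter from the closed-form values, derives the wavelet filter through the quadrature mirror relation $h_l = (-1)^l g_{L-1-l}$ (a standard relation, assumed here to be the convention behind A.10), and evaluates the usual filter identities; all variable names are illustrative.

```python
import math

SQRT3 = math.sqrt(3.0)
DENOM = 4.0 * math.sqrt(2.0)

# Scaling filter g (A.9)
g = [(1 + SQRT3) / DENOM, (3 + SQRT3) / DENOM, (3 - SQRT3) / DENOM, (1 - SQRT3) / DENOM]

# Wavelet filter h via the quadrature mirror relation h_l = (-1)^l * g_{L-1-l}
h = [(-1) ** l * g[3 - l] for l in range(4)]

# Wavelet filter written out directly from A.10, for comparison
h_direct = [(1 - SQRT3) / DENOM, (-3 + SQRT3) / DENOM, (3 + SQRT3) / DENOM, (-1 - SQRT3) / DENOM]

sum_g = sum(g)                                # equals sqrt(2) for a DWT scaling filter
energy_g = sum(c * c for c in g)              # unit energy
sum_h = sum(h)                                # wavelet filter sums to zero
energy_h = sum(c * c for c in h)              # unit energy
cross = sum(a * b for a, b in zip(g, h))      # g and h are orthogonal
even_shift = g[0] * g[2] + g[1] * g[3]        # g is orthogonal to its own even shift
```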



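Appendix A.3. Proof of Theorem 1

The scaling filter $m_0(\xi)$ associated with the scaling function $\phi$ for the Daubechies D(4) wavelet
has the following form:

$$ m_0(\xi) = \left( \frac{1 + e^{i\xi}}{2} \right)^{2} \mathcal{L}(\xi), \tag{A.11} $$

where $\mathcal{L}(\xi)$ denotes a trigonometric polynomial of order 1 such that the following holds:

$$ |\mathcal{L}(\xi)|^2 = P_2\left( \sin^2(\xi/2) \right) \quad \text{and} \quad \mathcal{L}(0) = 1, \tag{A.12} $$

where $P_2$ represents a polynomial of order 1. From A.3 we easily get:

$$ P_2(x) = 1 + 2x. \tag{A.13} $$

Furthermore, there exists a constant $C$ such that

$$ |\mathcal{L}(\xi)| \leq 1 + C|\xi|. \tag{A.14} $$

Hence from A.11 it follows that

$$ |m_0(\xi)|^2 = \left| \frac{1 + e^{i\xi}}{2} \right|^4 P_2\left( \sin^2(\xi/2) \right) = \frac{1}{4} (1 + \cos \xi)^2 (2 - \cos \xi). \tag{A.15} $$

Further, we get

$$ |m_0(\xi + \pi)|^2 = \frac{1}{4} (1 - \cos \xi)^2 (2 + \cos \xi). \tag{A.16} $$
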

Since the Daubechies D(4) wavelet is compactly supported, it can be used for construction of
the multiresolution analysis with scaling function $\phi \in L^2(\mathbb{R})$. Then, using the Fourier transform,
we define function $\psi$, which is the wavelet associated with this multiresolution analysis (Mallat,
1998)



mo 
(A.17)



The squared gain function of the wavelet takes the form 



mo ( I + vr 



(A.18) 



The Fourier transform of the scaling function $\phi$ takes the form (Igari, 1998, p. 231)

+00 

1^(01 = n cos2(7r2-'=OI^(2-'OI <Ci{l + \^\r'\^r, ^ G Z, 
k=l 



(A.19) 



where $\alpha = \log 3 / (2 \log 2) \approx 0.7925$ and $C_1$ is a constant independent of $\xi$ (Igari, 1998). Therefore,
the following estimate holds:



mo ( I + TT 



<cUi + 



2a 



{i + c icif . 



(A.20) 



From A.20 we can easily show that the integral of the squared gain function of the wavelet $\psi$ is finite, i.e.,



+00 



2 1 , 

ds = / ... + 
s Jo Jl 



< +00. 



(A.21) 



To complete the proof we have to show that the following equality holds: 



(A.22) 



It is sufficient to show that a similar relation holds for the squared gain function of the scaling
function $\phi(\cdot)$, i.e.,





2 















Then, A. 23 follows from the formula 



^(0 = 11^0 (I), 
.7=1 ^ 



and from the equality 
Since, 



This implies that the extra conditions hold, i.e.. 



r+00 
Jo 



2 1 



-ds 

S .10 



+00 



i'i-s] 



2 1 1 

-ds = -C^. 
s 2 ^ 



(A.23) 

(A.24) 

(A.25) 
(A.26) 

(A.27) 

□ 
Appendix A.4. Proof of Proposition 2

This proof readily follows from the Calderon reconstruction formula, Theorem 1, and
Proposition 1, where we specify exact conditions for the Daubechies D(4) wavelet.

□

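Appendix A.5. Proof of Proposition 4

For a time series $X_t$ and its discrete Fourier transform $\mathcal{X}_k$, with frequency $f_k = k/N$, we get
the following relationship (using Parseval's theorem, see e.g. Percival and Walden, 2000, p. 72):

$$ \|\mathbf{X}\|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{X}_k|^2, \tag{A.28} $$

where $|\mathcal{X}_k|^2 / N$ establishes an energy spectrum at frequencies $f_k = k/N$.

For a time series $X_t$, using Parseval's theorem we can write for the $j$-th level MODWT wavelet
and scaling vectors of coefficients:

$$ \|\mathbf{W}_j\|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{X}_k|^2 |H_j(k/N)|^2, \tag{A.29} $$

$$ \|\mathbf{V}_j\|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{X}_k|^2 |G_j(k/N)|^2. \tag{A.30} $$

Summation of the vectors yields:

$$ \|\mathbf{W}_j\|^2 + \|\mathbf{V}_j\|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{X}_k|^2 \left( |H_j(k/N)|^2 + |G_j(k/N)|^2 \right). \tag{A.31} $$

Following Percival and Mofjeld (1997) and Percival and Walden (2000) we can write for any
$j \geq 2$

$$ \begin{aligned}
\|\mathbf{W}_j\|^2 + \|\mathbf{V}_j\|^2
&= \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{X}_k|^2 \left( |H_j(k/N)|^2 + |G_j(k/N)|^2 \right) \\
&= \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{X}_k|^2 \left( |H(2^{j-1}k/N)|^2 \prod_{l=0}^{j-2} |G(2^{l}k/N)|^2 + |G(2^{j-1}k/N)|^2 \prod_{l=0}^{j-2} |G(2^{l}k/N)|^2 \right) \\
&= \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{X}_k|^2 \left( \mathcal{H}(2^{j-1}k/N) + \mathcal{G}(2^{j-1}k/N) \right) |G_{j-1}(k/N)|^2 \\
&= \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{X}_k|^2 |G_{j-1}(k/N)|^2 = \|\mathbf{V}_{j-1}\|^2,
\end{aligned} \tag{A.32} $$

where we have used the fact that

$$ |H(2^{j-1}k/N)|^2 = \mathcal{H}(2^{j-1}k/N), \tag{A.33} $$

$$ |G(2^{j-1}k/N)|^2 = \mathcal{G}(2^{j-1}k/N) \tag{A.34} $$

and

$$ \mathcal{H}(f) + \mathcal{G}(f) = 1, \quad \text{for all } f. \tag{A.35} $$

Using the above result for $j \geq 2$, by induction we obtain

$$ \|\mathbf{V}_1\|^2 = \sum_{j=2}^{J^m} \|\mathbf{W}_j\|^2 + \|\mathbf{V}_{J^m}\|^2, \quad J^m \geq 2, \tag{A.36} $$

which in fact says that we can decompose $\|\mathbf{V}_2\|^2$ further to higher levels. To complete the proof
of the energy decomposition using the MODWT wavelet and scaling coefficients we have to show
that the following holds (Percival and Walden, 2000):

$$ \|\mathbf{X}\|^2 = \|\mathbf{W}_1\|^2 + \|\mathbf{V}_1\|^2. \tag{A.37} $$

Using Parseval's theorem we can write the vectors of the MODWT wavelet and scaling coefficients
at the first level in the following form:

$$ \|\mathbf{W}_1\|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |H_1(k/N)|^2 |\mathcal{X}_k|^2 = \frac{1}{N} \sum_{k=0}^{N-1} \mathcal{H}(k/N) |\mathcal{X}_k|^2, \tag{A.38} $$

$$ \|\mathbf{V}_1\|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |G_1(k/N)|^2 |\mathcal{X}_k|^2 = \frac{1}{N} \sum_{k=0}^{N-1} \mathcal{G}(k/N) |\mathcal{X}_k|^2. \tag{A.39} $$

Summation of the vectors yields

$$ \|\mathbf{W}_1\|^2 + \|\mathbf{V}_1\|^2 = \frac{1}{N} \sum_{k=0}^{N-1} \left( \mathcal{H}(k/N) + \mathcal{G}(k/N) \right) |\mathcal{X}_k|^2. \tag{A.40} $$

With $\mathcal{H}(f) + \mathcal{G}(f) = 1$ for all $f$ we obtain

$$ \|\mathbf{W}_1\|^2 + \|\mathbf{V}_1\|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{X}_k|^2 = \|\mathbf{X}\|^2. \tag{A.41} $$

Hence, using A.41 we finally get the energy decomposition of a time series $(X_t)_{t \in [0, N-1]}$ for a
maximum scale $J^m \geq 1$:

$$ \|\mathbf{X}\|^2 = \sum_{j=1}^{J^m} \|\mathbf{W}_j\|^2 + \|\mathbf{V}_{J^m}\|^2. \tag{A.42} $$

□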
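
The energy decomposition established in this proof can be checked numerically on a small example. The sketch below uses the Haar MODWT with circular boundary treatment rather than the D(4) filter discussed in this appendix, purely to keep the code short; this swap, the function names and the toy series are assumptions made for illustration only.

```python
def haar_modwt(x, levels):
    """Haar MODWT with circular boundary conditions.

    Returns (wavelet_coeffs, scaling_coeffs), where wavelet_coeffs[j-1] holds the
    level-j wavelet coefficients and scaling_coeffs the final-level scaling ones.
    """
    n = len(x)
    v = list(x)
    wavelet_coeffs = []
    for j in range(1, levels + 1):
        shift = 2 ** (j - 1)  # the filter is dilated by inserting 2^(j-1) - 1 zeros
        w = [0.5 * (v[t] - v[(t - shift) % n]) for t in range(n)]
        v = [0.5 * (v[t] + v[(t - shift) % n]) for t in range(n)]
        wavelet_coeffs.append(w)
    return wavelet_coeffs, v


def energy(values):
    """Squared Euclidean norm."""
    return sum(c * c for c in values)


# Hypothetical toy series standing in for intraday returns.
toy_series = [0.3, -1.2, 0.7, 0.1, -0.4, 0.9, -0.6, 0.2, 1.1, -0.8, 0.5, -0.3]
wavelet_coeffs, scaling_coeffs = haar_modwt(toy_series, levels=3)

total_energy = energy(toy_series)
energy_by_scale = [energy(w) for w in wavelet_coeffs]
reconstructed_energy = sum(energy_by_scale) + energy(scaling_coeffs)
```

Up to floating-point error, `reconstructed_energy` equals `total_energy`, which is the numerical counterpart of A.42 for this filter: the energies of the wavelet coefficients at each level plus the energy of the final scaling coefficients add up to the energy of the series.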
Appendix A.6. Proof of Proposition 5 and Proposition 6

The proof of the unbiasedness and consistency of the $\widehat{RV}_{t,h}^{(WRV)}$ estimator comes readily from
the introduced theory. Let us summarize the logic in the following lines.

From definition 15 and properties of the $\widehat{RV}_{t,h}^{(sparse)}$ estimator (Andersen et al., 2003), we know
that $\widehat{RV}_{t,h}^{(sparse)} = \sum_{i=1}^{n} r^2_{t-h+(i/n)h}$ is an unbiased and consistent estimator of $RV_{t,h} = \int_{t-h}^{t} \sigma_s^2 \, ds$
on $[t-h, t]$, as $E[RV_{t,h} | \mathcal{F}_t] = E[\widehat{RV}_{t,h}^{(sparse)} | \mathcal{F}_t]$ and $\mathrm{plim}_{n \to \infty} \widehat{RV}_{t,h}^{(sparse)} = RV_{t,h}$, with uniform
convergence in probability.

Moreover, from Proposition 27 we know that we can conveniently decompose the energy of the
process $r_{t,h}$ over $[t-h, t]$, for $0 \leq h \leq t \leq T$, using the MODWT coefficients (proof in
Appendix A.5). Using equation 29 we can define the intraday returns over $[t-h, t]$ as



As 



estimator for J" 



> 1: 



, we can directly apply this decomposition to the realized variance 



(sparse) 

RVti 



1=1 



3=1 k=l 



(A.44) 



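Thus $E[RV_{t,h} | \mathcal{F}_t] = E[\widehat{RV}_{t,h}^{(sparse)} | \mathcal{F}_t] = E[\widehat{RV}_{t,h}^{(WRV)} | \mathcal{F}_t]$.

Moreover, based on the wavelet representation theorem in Proposition 2 proved in
Appendix A.4, $\mathrm{plim}_{n \to \infty} \widehat{RV}_{t,h}^{(sparse)} = \mathrm{plim}_{n \to \infty} \widehat{RV}_{t,h}^{(WRV)} = \int_{t-h}^{t} \sigma_s^2 \, ds$, and $\widehat{RV}_{t,h}^{(WRV)}$ provides a
consistent estimator with increasing sampling frequency $n \to \infty$.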



□ 



Appendix A.7. Proof of Proposition 9 and Proposition 10

All of the theory has already been proved; we just bring it together in a new estimator. Thus
we describe the logic of the proof here.

Consider a log-price process $(p_t)_{t \in [0,T]}$ that is contaminated with noise, i.e., $y_t = p_t + \epsilon_t$, where
$(y_t)_{t \in [0,T]}$ is the observed log-price process. Moreover, $p_t$ follows a jump-diffusion process

$$ dp_t = \mu_t \, dt + \sigma_t \, dW_t + \xi_t \, dq_t, \tag{A.45} $$

where $q$ is a Poisson process uncorrelated with $W$ and governed by the constant jump intensity
$\lambda$. The magnitude of the jump in the return process is controlled by the factor $\xi_t$. Its
quadratic return variation over $[t-h, t]$, for $0 \leq h \leq t \leq T$, is

$$ QV_{t,h} = \underbrace{\int_{t-h}^{t} \sigma_s^2 \, ds}_{IV_{t,h}} + \underbrace{\sum_{t-h \leq l \leq t} J_l^2}_{\text{Jump Variation}}. \tag{A.46} $$
Fan and Wang (2007) show that integrated variance and jump variation can be separated using
the wavelet estimator. The authors prove that the WJV estimator can consistently estimate the
jump variation part, and that the jump-adjusted price $y_{t,h}^{(J)} = y_{t,h} - WJV$ converges in probability
to its theoretical, continuous counterpart at a convergence rate of $n^{-1/4}$.

We have proved that the $\widehat{RV}_{t,h}^{(WRV)}$ estimator is able to estimate integrated variance consistently
(Proposition 5 and Proposition 6 proved in Appendix A.6). Thus, the realized variance
of process $y_{t,h}^{(J)}$ can be estimated using the energy decomposition of the process:



1=1 ^ ' j=\ h=\ 



where W ^_t , fe^ are wavelet coefficients estimated using the MODWT (the proof is exactly the 
same as the one in Appendix A.6). 

Using the TSRV estimator of Zhang et al. (2005), we consistently estimate the integrated variance
of $p_t$ from the noisy observed data $y_t$:

$$ \widehat{RV}_{t,h}^{(TSRV)} = \underbrace{\widehat{RV}_{t,h}^{(average)}}_{\text{slow time scale}} - \frac{\bar{n}}{n} \underbrace{\widehat{RV}_{t,h}^{(all)}}_{\text{fast time scale}}, \tag{A.48} $$

where $\widehat{RV}_{t,h}^{(average)}$ is the average of the RV estimators on a grid (see Zhang et al. (2005) for a
full explanation). Finally, replacing the realized variance with the decomposition using wavelet
realized variance on $y_{t,h}^{(J)}$ from Equation A.47 and putting it into the TSRV, we get

$$ \widehat{RV}_{t,h}^{(JWTSRV)} = \widehat{RV}_{t,h}^{(WRV,J)} - \frac{\bar{n}}{n} \widehat{RV}_{t,h}^{(all,J)}, \tag{A.49} $$

where RV^ i^ ' ^ ~ h S^=i '^j=i~^ Sfc=i h * /i obtained from the wavelet coefficient esti- 



mates on a grid of size n = n/G on the jump-adjusted observed data, y^'2 = yt,h — Jl- 

We have shown that $\widehat{RV}_{t,h}^{(WRV,J)}$ converges to the integrated variation of process $y_{t,h}$, and Zhang
et al. (2005) provide the proof for the TSRV.

□
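
The two-scale construction of Zhang et al. (2005) that underlies A.48 can be sketched as follows. The sketch covers only the plain TSRV on log-prices, without the wavelet decomposition, the jump adjustment or any small sample correction; it assumes the usual choice $\bar{n} = (n - G + 1)/G$ for $G$ subsampling grids, and the function names and toy inputs are hypothetical.

```python
def rv_from_log_prices(log_prices):
    """Realized variance: sum of squared increments of a log-price path."""
    return sum((b - a) ** 2 for a, b in zip(log_prices, log_prices[1:]))


def two_scale_rv(log_prices, n_grids):
    """Two-scale realized variance: averaged sparse RV minus a noise correction.

    The fast scale uses all observations; the slow scale averages the RV over
    `n_grids` subsampled grids (every n_grids-th price, with different offsets).
    """
    n = len(log_prices) - 1  # number of returns on the fastest scale
    if n_grids < 1 or n_grids > n:
        raise ValueError("n_grids must lie between 1 and the number of returns")
    rv_all = rv_from_log_prices(log_prices)
    rv_average = sum(
        rv_from_log_prices(log_prices[offset::n_grids]) for offset in range(n_grids)
    ) / n_grids
    n_bar = (n - n_grids + 1) / n_grids  # average number of returns per subgrid
    return rv_average - (n_bar / n) * rv_all


# Hypothetical toy path: 13 equally spaced "log-prices" with unit increments.
toy_path = [float(i) for i in range(13)]
tsrv_value = two_scale_rv(toy_path, n_grids=3)
flat_value = two_scale_rv([1.0] * 13, n_grids=3)
```

For the toy path the three subgrids give realized variances of 36, 27 and 27, so the slow-scale average is 30, and the correction term is $(10/3)/12 \times 12 = 10/3$. In the JWTSRV, both realized variance terms would be replaced by their wavelet-based counterparts computed on the jump-adjusted data, as in A.49.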